PCT

WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

(51) International Patent Classification 5: C12Q 1/68, C12P 19/34, G01N 33/[illegible], 33/566, C07H 15/12     A1

(11) International Publication Number: WO 91/19813

(43) International Publication Date: 26 December 1991 (26.12.91)

(21) International Application Number: PCT/US91/04078

(22) International Filing Date: 10 June 1991 (10.06.91)

(30) Priority data: 536,428     11 June 1990 (11.06.90)     US



[Applicant name illegible] [...] 1140, Boulder, CO 80306 (US).

[Inventor names illegible]; 955 - 8th Street, Boulder, CO [...]; 3100 [...], Boulder, CO 80303 (US).



(74) Agents: SWANSON, Barry, J. et al.; 4582 S. Ulster St. Pkwy., #403, Denver, CO 80237 (US).

(81) Designated States: AT, AT (European patent), AU, BB, BE (European patent), BF (OAPI patent), BG, BJ (OAPI patent), BR, CA, CF (OAPI patent), CG (OAPI patent), CH, CH (European patent), CI (OAPI patent), CM (OAPI patent), DE, DE (European patent), DK, DK (European patent), ES, ES (European patent), FI, FR (European patent), GA (OAPI patent), GB, GB (European patent), GN (OAPI patent), GR (European patent), HU, IT (European patent), JP, KP, KR, LK, LU, LU (European patent), MC, MG, ML (OAPI patent), MR (OAPI patent), MW, NL, NL (European patent), NO, PL, RO, SD, SE, SE (European patent), SN (OAPI patent), SU, TD (OAPI patent), TG (OAPI patent).

Published

With international search report.

Before the expiration of the time limit for amending the claims and to be republished in the event of the receipt of amendments.



(54) Title: NUCLEIC ACID LIGANDS



[Front-page figure: hairpin secondary structure of an RNA sequence, with nucleotide positions numbered from -30 to +40; drawing not reproduced.]



(57) Abstract

A new class of nucleic acid compounds [...] binding affinity for three dimensional molecular [...] the Systematic Evolution of Ligands by Exponential enrichment [...] iteratively enriched in high affinity nucleic acids and amplified for further partitioning.






FOR THE PURPOSES OF INFORMATION ONLY

Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.

AT  Austria
AU  Australia
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Cote d'Ivoire
CM  Cameroon
CS  Czechoslovakia
DE  Germany
DK  Denmark
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GN  Guinea
GR  Greece
HU  Hungary
IT  Italy
JP  Japan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
LI  Liechtenstein
LK  Sri Lanka
LU  Luxembourg
MC  Monaco
MG  Madagascar
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
NL  Netherlands
NO  Norway
PL  Poland
RO  Romania
SD  Sudan
SE  Sweden
SN  Senegal
SU  Soviet Union
TD  Chad
TG  Togo
US  United States of America



NUCLEIC ACID LIGANDS

This application is a Continuation-in-Part of
United States Patent Application Serial No. 07/536,428, 
filed June 11, 1990, entitled Systematic Evolution of 
Ligands By Exponential Enrichment. 

This work was supported by grants from the 
United States Government funded through the National 
Institutes of Health. The U.S. Government has certain 
rights in this invention. 

FIELD OF THE INVENTION 

We describe herein a new class of high-affinity 
nucleic acid ligands that specifically bind a desired 
target molecule. A method is presented for selecting a 
nucleic acid ligand that specifically binds any desired 
target molecule. The method is termed SELEX, an 
acronym for Systematic Evolution of Ligands by 
Exponential enrichment. The method of the invention 
(SELEX) is useful to isolate a nucleic acid ligand for 
a desired target molecule. The nucleic acid products 
of the invention are useful for any purpose to which a 
binding reaction may be put, for example in assay 
methods, diagnostic procedures, cell sorting, as 
inhibitors of target molecule function, as probes, as 
sequestering agents and the like. In addition, nucleic 
acid products of the invention can have catalytic 
activity. Target molecules include natural and synthetic
polymers, including proteins, polysaccharides,
glycoproteins, hormones, receptors and cell surfaces,
and small molecules such as drugs, metabolites,
cofactors, transition state analogs and toxins.

BACKGROUND OF THE INVENTION 

Most proteins or small molecules are not known 
to specifically bind to nucleic acids. The known 
protein exceptions are those regulatory proteins such 



SUBSTITUTE SHEEl 



as repressors, polymerases, activators and the like 
which function in a living cell to bring about the 
transfer of genetic information encoded in the nucleic 
acids into cellular structures and the replication of
the genetic material. Furthermore, small molecules
such as GTP bind to some intron RNAs.

Living matter has evolved to limit the function 
of nucleic acids to a largely informational role. The 
Central Dogma, as postulated by Crick, both originally 
and in expanded form, proposes that nucleic acids
(either RNA or DNA) can serve as templates for the 
synthesis of other nucleic acids through replicative 
processes that "read" the information in a template 
nucleic acid and thus yield complementary nucleic 
acids. All of the experimental paradigms for genetics 
and gene expression depend on these properties of 
nucleic acids: in essence, double-stranded nucleic 
acids are informationally redundant because of the 
chemical concept of base pairs and because replicative 
processes are able to use that base pairing in a
relatively error-free manner.

The individual components of proteins, the
twenty natural amino acids, possess sufficient chemical
differences and activities to provide an enormous
breadth of activities for both binding and catalysis.
Nucleic acids, however, are thought to have narrower
chemical possibilities than proteins, but to have an
informational role that allows genetic information to
be passed from virus to virus, cell to cell, and
organism to organism. In this context nucleic acid
components, the nucleotides, must possess only pairs of
surfaces that allow informational redundancy within a
Watson-Crick base pair. Nucleic acid components need
not possess chemical differences and activities
sufficient for either a wide range of binding or
catalysis.

However, some nucleic acids found in nature do 



participate in binding to certain target molecules and 
even a few instances of catalysis have been reported. 
The range of activities of this kind is narrow compared 
to proteins and more specifically antibodies. For 
example, where nucleic acids are known to bind to some 
protein targets with high affinity and specificity, the 
binding depends on the exact sequences of nucleotides 
that comprise the DNA or RNA ligand. Thus, short 
double-stranded DNA sequences are known to bind to
target proteins that repress or activate transcription 
in both prokaryotes and eukaryotes. Other short 
double-stranded DNA sequences are known to bind to 
restriction endonucleases, protein targets that can be 
selected with high affinity and specificity. Other 
short DNA sequences serve as centromeres and telomeres 
on chromosomes, presumably by creating ligands for the 
binding of specific proteins that participate in 
chromosome mechanics. Thus, double-stranded DNA has a 
well-known capacity to bind within the nooks and 
crannies of target proteins whose functions are 
directed to DNA binding. Single-stranded DNA can also 
bind to some proteins with high affinity and 
specificity, although the number of examples is rather 
smaller. From the known examples of double-stranded 
DNA binding proteins, it has become possible to 
describe the binding interactions as involving various 
protein motifs projecting amino acid side chains into 
the major groove of B form double-stranded DNA, 
providing the sequence inspection that allows 
specificity. 

Double-stranded RNA occasionally serves as a 
ligand for certain proteins, for example, the 
endonuclease RNase III from E. coli. There are more 
known instances of target proteins that bind to single- 
stranded RNA ligands, although in these cases the 
single-stranded RNA often forms a complex three- 
dimensional shape that includes local regions of 




wo 91/19813 PCT/US91/04078 



intramolecular double-strandedness. The amino-acyl
tRNA synthetases bind tightly to tRNA molecules with
high specificity. A short region within the genomes of
RNA viruses binds tightly and with high specificity to
the viral coat proteins. A short sequence of RNA binds
to the bacteriophage T4-encoded DNA polymerase, again
with high affinity and specificity. Thus, it is
possible to find RNA and DNA ligands, either double- or
single-stranded, serving as binding partners for
specific protein targets. Most known DNA binding
proteins bind specifically to double-stranded DNA,
while most RNA binding proteins recognize single-
stranded RNA. This statistical bias in the literature
no doubt reflects the present biosphere's statistical
predisposition to use DNA as a double-stranded genome
and RNA as a single-stranded entity in the many roles
RNA plays beyond serving as a genome. Chemically there
is no strong reason to dismiss single-stranded DNA as a
fully able partner for specific protein interactions.

RNA and DNA have also been found to bind to
smaller target molecules. Double-stranded DNA binds to
various antibiotics, such as actinomycin D. A specific
single-stranded RNA binds to the antibiotic
thiostreptone; specific RNA sequences and structures
probably bind to certain other antibiotics, especially
those whose function is to inactivate ribosomes in a
target organism. A family of evolutionarily related
RNAs binds with specificity and decent affinity to
nucleotides and nucleosides (Bass, B. and Cech, T.
(1984) Nature 308:820-826) as well as to one of the
twenty amino acids (Yarus, M. (1988) Science 243:1751-
1758). Catalytic RNAs are now known as well, although
these molecules perform over a narrow range of chemical
possibilities, which are thus far related largely to
phosphodiester transfer reactions and hydrolysis of
nucleic acids.

Despite these known instances, the great
majority of proteins and other cellular components are
thought not to bind to nucleic acids under 
physiological conditions and such binding as may be 
observed is non-specific. Either the capacity of 
nucleic acids to bind other compounds is limited to the
relatively few instances enumerated supra, or the
chemical repertoire of the nucleic acids for specific
binding is avoided (selected against) in the structures
that occur naturally. The present invention is
premised on the inventors' fundamental insight that
nucleic acids as chemical compounds can form a
virtually limitless array of shapes, sizes and
configurations, and are capable of a far broader
repertoire of binding and catalytic functions than
those displayed in biological systems.

The chemical interactions have been explored in
cases of certain known instances of protein-nucleic
acid binding. For example, the size and sequence of
the RNA site of bacteriophage R17 coat protein binding
has been identified by Uhlenbeck and coworkers. The
minimal natural RNA binding site (21 bases long) for
the R17 coat protein was determined by subjecting
variable-sized labeled fragments of the mRNA to
nitrocellulose filter binding assays in which protein-
RNA fragment complexes remain bound to the filter
(Carey et al. (1983) Biochemistry 22:2601). A number
of sequence variants of the minimal R17 coat protein
binding site were created in vitro in order to
determine the contributions of individual nucleic acids
to protein binding (Uhlenbeck et al. (1983) J. Biomol.
Structure Dynamics 1:539 and Romaniuk et al. (1987)
Biochemistry 26:1563). It was found that the
maintenance of the hairpin loop structure of the
binding site was essential for protein binding but, in
addition, that nucleotide substitutions at most of the
single-stranded residues in the binding site, including
a bulged nucleotide in the hairpin stem, significantly
affected binding. In similar studies, the binding of
bacteriophage Qβ coat protein to its translational
operator was examined (Witherell and Uhlenbeck (1989)
Biochemistry 28:71). The Qβ coat protein RNA binding
site was found to be similar to that of R17 in size,
and in predicted secondary structure, in that it
comprised about 20 bases with an 8 base pair hairpin
structure which included a bulged nucleotide and a 3
base loop. In contrast to the R17 coat protein binding
site, only one of the single-stranded residues of the
loop is essential for binding and the presence of the
bulged nucleotide is not required. The protein-RNA
binding interactions involved in translational
regulation display significant specificity.

Nucleic acids are known to form secondary and
tertiary structures in solution. The double-stranded
forms of DNA include the so-called B double-helical
form, Z-DNA and superhelical twists (Rich, A. et al.
(1984) Ann. Rev. Biochem. 53:791-846). Single-stranded
RNA forms localized regions of secondary structure such
as hairpin loops and pseudoknot structures (Schimmel,
P. (1989) Cell 58:9-12). However, little is known
concerning the effects of unpaired loop nucleotides on
stability of loop structure, kinetics of formation and
denaturation, thermodynamics, and almost nothing is
known of tertiary structures and three dimensional
shape, nor of the kinetics and thermodynamics of
tertiary folding in nucleic acids (Tuerk, C. et al.
(1988) Proc. Natl. Acad. Sci. USA 85:1364-1368).

A type of in vitro evolution was reported in
replication of the RNA bacteriophage Qβ. Mills, D.R.
et al. (1967) Proc. Natl. Acad. Sci. USA 51:217-224;
Levisohn, R. and Spiegelman, S. (1968) Proc. Natl.
Acad. Sci. USA 60:866-872; Levisohn, R. and Spiegelman,
S. (1969) Proc. Natl. Acad. Sci. USA 51:805-811;
Saffhill, R. et al. (1970) J. Mol. Biol. 51:531-539;
Kacian, D.L. et al. (1972) Proc. Natl. Acad. Sci. USA
62:3038-3042; Mills, D.R. et al. (1973) Science
220:916-927. The phage RNA serves as a poly-cistronic
messenger RNA directing translation of phage-specific
proteins and also as a template for its own replication
catalyzed by Qβ RNA replicase. This RNA replicase was
shown to be highly specific for its own RNA templates.
During the course of cycles of replication in vitro
small variant RNAs were isolated which were also
replicated by Qβ replicase. Minor alterations in the
conditions under which cycles of replication were
performed were found to result in the accumulation of
different RNAs, presumably because their replication
was favored under the altered conditions. In these
experiments, the selected RNA had to be bound
efficiently by the replicase to initiate replication
and had to serve as a kinetically favored template
during elongation of RNA. Kramer et al. (1974) J. Mol.
Biol. 89:719 reported the isolation of a mutant RNA
template of Qβ replicase, the replication of which was
more resistant to inhibition by ethidium bromide than
the natural template. It was suggested that this
mutant was not present in the initial RNA population
but was generated by sequential mutation during cycles
of in vitro replication with Qβ replicase. The only
source of variation during selection was the intrinsic
error rate during elongation by Qβ replicase. In these
studies what was termed "selection" occurred by
preferential amplification of one or more of a limited
number of spontaneous variants of an initially
homogeneous RNA sequence. There was no selection of a
desired result, only that which was intrinsic to the
mode of action of Qβ replicase.

Joyce and Robertson (Joyce (1989) in RNA:
Catalysis, Splicing, Evolution, Belfort and Shub
(eds.), Elsevier, Amsterdam pp. 83-87; and Robertson
and Joyce (1990) Nature 344:467) reported a method for
identifying RNAs which specifically cleave single-
stranded DNA. The selection for catalytic activity was
based on the ability of the ribozyme to catalyze the
cleavage of a substrate ssRNA or DNA at a specific
position and transfer the 3'-end of the substrate to
the 3'-end of the ribozyme. The product of the desired
reaction was selected by using an oligodeoxynucleotide
primer which could bind only to the completed product
across the junction formed by the catalytic reaction
and allowed selective reverse transcription of the
ribozyme sequence. The selected catalytic sequences
were amplified by attachment of the promoter of T7 RNA
polymerase to the 3'-end of the cDNA, followed by
transcription to RNA. The method was employed to
identify from a small number of ribozyme variants the
variant that was most reactive for cleavage of a
selected substrate. Only a limited array of variants
was testable, since variation depended upon single
nucleotide changes occurring during amplification.

The prior art has not taught or suggested more
than a limited range of chemical functions for nucleic
acids in their interactions with other substances: as
targets for protein ligands evolved to bind certain
specific oligonucleotide sequences; more recently, as
catalysts with a limited range of activities. Prior
"selection" experiments have been limited to a narrow
range of variants of a previously described function.
Now, for the first time, it will be understood that the
nucleic acids are capable of a vastly broad range of
functions and the methodology for realizing that
capability is disclosed herein.

SUMMARY OF THE INVENTION

The present invention provides a class of
products which are nucleic acid molecules, each having
a unique sequence, each of which has the property of
binding specifically to a desired target compound or
molecule. Each compound of the invention is a specific
ligand of a given target molecule. The invention is
based on the unique insight that nucleic acids have
sufficient capacity for forming a variety of two- and
three-dimensional structures and sufficient chemical
versatility available within their monomers to act as
ligands (form specific binding pairs) with virtually
any chemical compound, whether monomeric or polymeric.
Molecules of any size can serve as targets. Most
commonly, and preferably, for therapeutic applications,
binding takes place in aqueous solution at conditions
of salt, temperature and pH near acceptable
physiological limits.

The invention also provides a method which is 
generally applicable to make a nucleic acid ligand for 
any desired target. The method involves selection from 
a mixture of candidates and step-wise iterations of 
structural improvement, using the same general 
selection theme, to achieve virtually any desired 
criterion of binding affinity and selectivity. 
Starting from a mixture of nucleic acids, preferably
comprising a segment of randomized sequence, the
method, termed SELEX herein, includes steps of
contacting the mixture with the target under conditions
favorable for binding, partitioning unbound nucleic
acids from those nucleic acids which have bound to
target molecules, dissociating the nucleic
acid-target pairs, amplifying the nucleic acids
dissociated from the nucleic acid-target pairs to yield
a ligand-enriched mixture of nucleic acids, then
reiterating the steps of binding, partitioning, 
dissociating and amplifying through as many cycles as 
desired. 
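
The cycle of contacting, partitioning, dissociating and amplifying can be illustrated with a toy simulation. The following is only a sketch under stated assumptions, not the procedure of the invention: the motif TARGET_MOTIF, the scoring function toy_affinity and the pool size are hypothetical, binding is modeled as a simple probability, and amplification is modeled as error-free random copying.

```python
import random

BASES = "ACGU"
TARGET_MOTIF = "GGAUCCAA"  # hypothetical motif standing in for a real target


def toy_affinity(seq):
    """Assumed binding probability: squared fraction of positions matching the motif."""
    matches = sum(a == b for a, b in zip(seq, TARGET_MOTIF))
    return (matches / len(TARGET_MOTIF)) ** 2


def selex_round(pool, rng):
    """One idealized cycle: contact, partition, dissociate, amplify."""
    # Contact and partition: a sequence stays bound with probability toy_affinity.
    bound = [seq for seq in pool if rng.random() < toy_affinity(seq)]
    if not bound:
        return pool  # nothing bound; keep the pool unchanged
    # Dissociate and amplify: copy bound sequences at random back to the pool size.
    return [rng.choice(bound) for _ in pool]


def run_selex(pool_size=2000, rounds=5, seed=0):
    """Return the final pool and the mean toy affinity before and after each round."""
    rng = random.Random(seed)
    pool = ["".join(rng.choice(BASES) for _ in TARGET_MOTIF) for _ in range(pool_size)]
    history = [sum(map(toy_affinity, pool)) / pool_size]
    for _ in range(rounds):
        pool = selex_round(pool, rng)
        history.append(sum(map(toy_affinity, pool)) / pool_size)
    return pool, history


if __name__ == "__main__":
    _, history = run_selex()
    print(" -> ".join(f"{mean:.2f}" for mean in history))
```

In this toy model the mean affinity of the pool rises from round to round because better binders are more likely to survive partitioning and are then copied back to the full pool size. A real experiment would replace toy_affinity with a physical binding and partitioning step.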

While not bound by a theory of preparation, 
SELEX is based on the inventors' insight that within a 
nucleic acid mixture containing a large number of 
possible sequences and structures there is a wide range 
of binding affinities for a given target. A nucleic
acid mixture comprising, for example, a 20 nucleotide
randomized segment can have 4^20 candidate
possibilities. Those which have the higher affinity
constants for the target are most likely to bind.
After partitioning, dissociation and amplification, a
second nucleic acid mixture is generated, enriched for
the higher binding affinity candidates. Additional
rounds of selection progressively favor the best
ligands until the resulting nucleic acid mixture is
predominantly composed of only one or a few sequences.
These can then be cloned, sequenced and individually
tested for binding affinity as pure ligands.
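
The size of the candidate pool for a 20 nucleotide randomized segment can be checked with a short calculation. This is a sketch for illustration only; the function names are hypothetical, and the conversion to moles simply divides by the Avogadro constant.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def library_size(random_length, alphabet_size=4):
    """Number of distinct sequences in a fully randomized segment."""
    return alphabet_size ** random_length


def moles_for_one_copy_each(random_length):
    """Moles of nucleic acid holding, on average, one copy of every sequence."""
    return library_size(random_length) / AVOGADRO


if __name__ == "__main__":
    print(library_size(20))  # 1099511627776
    print(f"{moles_for_one_copy_each(20):.2e} mol")
```

A 20 nucleotide segment therefore gives about 1.1 x 10^12 candidates, an amount of material on the order of picomoles when each sequence is present once.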

Cycles of selection and amplification are
repeated until a desired goal is achieved. In the most
general case, selection/amplification is continued
until no significant improvement in binding strength is
achieved on repetition of the cycle. The iterative
selection/amplification method is sensitive enough to
allow isolation of a single sequence variant in a
mixture containing at least 65,000 sequence variants.
The method is even capable of isolating a small number
of high affinity sequences in a mixture containing 10^14
sequences. The method could, in principle, be used to
sample as many as about 10^18 different nucleic acid
species. The nucleic acids of the test mixture
preferably include a randomized sequence portion as
well as conserved sequences necessary for efficient
amplification. Nucleic acid sequence variants can be
produced in a number of ways including synthesis of
randomized nucleic acid sequences and size selection
from randomly cleaved cellular nucleic acids. The
variable sequence portion may contain fully or
partially random sequence; it may also contain
subportions of conserved sequence incorporated with
randomized sequence. Sequence variation in test
nucleic acids can be introduced or increased by
mutagenesis before or during the
selection/amplification iterations.
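
The stopping rule described above (continue until a further cycle gives no significant improvement) can be sketched with a deterministic enrichment model. The assumptions here are ours and are not taken from the text: each species is retained in proportion to its affinity, amplification is unbiased, and the pool of 65,536 species (4^8) with one strong binder is a hypothetical stand-in for a mixture of roughly 65,000 variants.

```python
def enrich_until_plateau(affinities, tolerance=1e-6, max_rounds=100):
    """Iterate selection/amplification until mean affinity stops improving.

    Returns the final species frequencies and the number of rounds performed.
    """
    n = len(affinities)
    freqs = [1.0 / n] * n  # start from an equimolar mixture
    mean = sum(f * a for f, a in zip(freqs, affinities))
    rounds = 0
    while rounds < max_rounds:
        # Selection: each species is retained in proportion to its affinity.
        weighted = [f * a for f, a in zip(freqs, affinities)]
        total = sum(weighted)
        # Amplification: renormalize so the frequencies again sum to one.
        freqs = [w / total for w in weighted]
        new_mean = sum(f * a for f, a in zip(freqs, affinities))
        rounds += 1
        if new_mean - mean < tolerance:
            break  # no significant improvement on repetition of the cycle
        mean = new_mean
    return freqs, rounds


if __name__ == "__main__":
    pool = [0.9] + [0.1] * 65535  # one strong binder among 4**8 species
    freqs, rounds = enrich_until_plateau(pool)
    print(rounds, round(freqs[0], 6))
```

One design point is worth noting: in early rounds the improvement per cycle is small because the best species is still rare, so the tolerance has to be chosen well below the first-round gain or the loop would stop prematurely.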

In one embodiment of the present invention, the
selection process is so efficient at isolating those
nucleic acid ligands that bind most strongly to the
selected target, that only one cycle of selection and
amplification is required. Such an efficient selection
may occur, for example, in a chromatographic-type
process wherein the ability of nucleic acids to
associate with targets bound on a column operates in
such a manner that the column is sufficiently able to
allow separation and isolation of the highest affinity
nucleic acid ligands.

In many cases, it is not necessarily desirable
to perform the iterative steps of SELEX until a single
nucleic acid ligand is identified. The target-specific
nucleic acid ligand solution may include a family of
nucleic acid structures or motifs that have a number of
conserved sequences and a number of sequences which can
be substituted or added without significantly affecting
the affinity of the nucleic acid ligands to the target.
By terminating the SELEX process prior to completion,
it is possible to determine the sequence of a number of
members of the nucleic acid ligand solution family,
which will allow the determination of a comprehensive
description of the nucleic acid ligand solution.
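
One simple way to describe such a family is a per-position consensus of its sequenced members. The sketch below is illustrative only and assumes that the sequences are already aligned and of equal length; the function name, the threshold parameter and the example sequences are hypothetical.

```python
from collections import Counter


def consensus(aligned, threshold=1.0):
    """Return a consensus string; positions below `threshold` agreement become 'N'."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        # Keep the base only if enough family members agree at this position.
        out.append(base if count / len(aligned) >= threshold else "N")
    return "".join(out)


if __name__ == "__main__":
    family = ["AAGUC", "AAGAC", "AAGCC"]  # made-up family members
    print(consensus(family))  # AAGNC
```

Positions reported as a base are candidates for conserved sequence, while positions reported as N tolerate substitution in this made-up family.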

After a description of the nucleic acid ligand
family has been resolved by SELEX, in certain cases it
may be desirable to perform a further series of SELEX
that is tailored by the information received during the
SELEX experiment. In one embodiment, the second series
of SELEX will fix those conserved regions of the
nucleic acid ligand family while randomizing all other
positions in the ligand structure. In an alternate
embodiment, the sequence of the most representative
member of the nucleic acid ligand family may be used as
the basis of a SELEX process wherein the original pool
of nucleic acid sequences is not completely randomized
but contains biases towards the best known ligand. By
these methods it is possible to optimize the SELEX
process to arrive at the most preferred nucleic acid
ligands.
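
The two ways of tailoring a second series of SELEX can be sketched as two pool generators. Both are illustrations under our own assumptions: the function names, the parent sequence and the mutation rate are hypothetical, and real pools would be made by chemical synthesis rather than in software.

```python
import random

BASES = "ACGU"


def fix_conserved(parent, conserved_positions, rng):
    """Keep conserved positions of `parent`; randomize every other position."""
    return "".join(
        base if i in conserved_positions else rng.choice(BASES)
        for i, base in enumerate(parent)
    )


def doped_copy(parent, mutation_rate, rng):
    """Copy `parent`, replacing each base with a different one at `mutation_rate`."""
    out = []
    for base in parent:
        if rng.random() < mutation_rate:
            # Biased pool: mostly the parent base, occasionally another base.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(0)
    parent = "GGAUCCAAGG"  # hypothetical best known ligand
    print([fix_conserved(parent, {0, 1, 2}, rng) for _ in range(3)])
    print([doped_copy(parent, 0.2, rng) for _ in range(3)])
```

The first generator corresponds to fixing the conserved regions and randomizing all other positions; the second corresponds to a pool that is not completely randomized but is biased towards the best known ligand.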

A variety of nucleic acid primary, secondary
and tertiary structures are known to exist. The
structures or motifs that have been shown most commonly
to be involved in non-Watson-Crick type interactions
are referred to as hairpin loops, symmetric and
asymmetric bulges, pseudoknots and myriad combinations
of the same. Almost all known cases of such motifs
suggest that they can be formed in a nucleic acid
sequence of no more than 30 nucleotides. For this
reason, it is preferred that SELEX procedures with
contiguous randomized segments be initiated with
nucleic acid sequences containing a randomized segment
of between about 20-50 nucleotides, and in the most
preferred embodiments between 25 and 40 nucleotides.
This invention includes solutions comprising a mixture
of between about 10^9 to 10^18 nucleic acid sequences
having a contiguous randomized sequence of at least
about 15 nucleotides in length. In the preferred
embodiment, the randomized section of sequences is
flanked by fixed sequences that facilitate the
amplification of the ligands.
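
The layout of such a candidate mixture, a contiguous randomized segment flanked by fixed sequences, can be sketched as follows. The flanking sequences FIXED_5PRIME and FIXED_3PRIME are placeholders invented for this example and are not primer sequences from the text; the pool size used here is far smaller than a real mixture.

```python
import random

FIXED_5PRIME = "GGGAGCTCAGAATAAACGCTCAA"  # hypothetical fixed flank
FIXED_3PRIME = "TTCGACATGAGGCCCGGATC"     # hypothetical fixed flank


def make_candidate_pool(n_sequences, random_length, seed=None):
    """Build DNA templates: fixed flank + randomized segment + fixed flank."""
    rng = random.Random(seed)
    pool = []
    for _ in range(n_sequences):
        randomized = "".join(rng.choice("ACGT") for _ in range(random_length))
        pool.append(FIXED_5PRIME + randomized + FIXED_3PRIME)
    return pool


if __name__ == "__main__":
    for template in make_candidate_pool(3, random_length=30, seed=0):
        print(template)
```

Because every template shares the same flanks, a single primer pair can amplify the whole pool regardless of the randomized segment.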

In the case of a polymeric target, such as a
protein, the ligand affinity can be increased by
applying SELEX to a mixture of candidates comprising a
first selected sequence and a second randomized sequence.
The sequence of the first selected ligand associated
with binding or subportions thereof can be introduced
into the randomized portion of the nucleic acids of a
second test mixture. The SELEX procedure is repeated
with the second test mixture to isolate a second
nucleic acid ligand, having two sequences selected for
binding to the target, which has increased binding
strength or increased specificity of binding compared
to the first nucleic acid ligand isolated. The
sequence of the second nucleic acid ligand associated
with binding to the target can then be introduced into
the variable portion of the nucleic acids of a third
test mixture which, after cycles of SELEX, results in a
third nucleic acid ligand. These procedures can be
repeated until a nucleic acid ligand of a desired
binding strength or a desired specificity of binding to
the target molecule is achieved. The process of
iterative selection and combination of nucleic acid
sequence elements that bind to a selected target
molecule is herein designated "walking," a term which
implies the optimized binding to other accessible areas
of a macromolecular target surface or cleft, starting
from a first binding domain. Increasing the area of
binding contact between ligand and target can increase
the affinity constant of the binding reaction. These
walking procedures are particularly useful for the
isolation of nucleic acid antibodies which are highly
specific for binding to a particular target molecule.

A variant of the walking procedure employs a
non-nucleic acid ligand termed "anchor" which binds to
the target molecule as a first binding domain. (See
Fig. 9.) This anchor molecule can in principle be any
non-nucleic acid molecule that binds to the target
molecule and which can be covalently linked directly or
indirectly to a nucleic acid. When the target molecule
is an enzyme, for example, the anchor molecule can be
an inhibitor or substrate of that enzyme. The anchor
can also be an antibody or antibody fragment specific
for the target. The anchor molecule is covalently
linked to a nucleic acid oligomer of known sequence to
produce a bridging molecule. The oligomer is
preferably comprised of a minimum of about 3-10
bases. A test mixture of candidate nucleic acids is
then prepared which includes a randomized portion and a
sequence complementary to the known sequence of the
bridging molecule. The bridging molecule is complexed
to the target molecule. SELEX is then applied to
select nucleic acids which bind to the complex of the
bridging molecule and the target molecule. Nucleic
acid ligands which bind to the complex are isolated.
Walking procedures as described above can then be
applied to obtain nucleic acid ligands with increased
binding strength or increased specificity of binding to
the complex. Walking procedures could employ
selections for binding to the complex or the target
itself. This method is particularly useful to isolate
nucleic acid ligands which bind at a particular site
within the target molecule. The complementary sequence
in the test mixture acts to ensure the isolation of
nucleic acid sequences which bind to the target
molecule at or near the binding site of the bridging
molecule. If the bridging molecule is derived from an
inhibitor of the target molecule, this method is likely
to result in a nucleic acid ligand which inhibits the
function of the target molecule. It is particularly
useful, for example, for the isolation of nucleic acids
which will activate or inhibit protein function. The
combination of ligand and target can have a new or
enhanced function.

The nucleic acid ligands of the present
invention may contain a plurality of ligand components.
As described above, nucleic acid ligands derived by
walking procedures may be considered as having more
than one nucleic acid ligand component. This invention
also includes nucleic acid antibodies that are
constructed based on the results obtained by SELEX
while not being identical to a nucleic acid ligand
identified by SELEX. For example, a nucleic acid
antibody may be constructed wherein a plurality of
identical ligand structures are made part of a single
nucleic acid. In another embodiment, SELEX may
identify more than one family of nucleic acid ligands
to a given target. In such case, a single nucleic acid
antibody may be constructed containing a plurality of
different ligand structures. SELEX experiments also
may be performed wherein fixed identical or different
ligand structures are joined by random nucleotide
regions and/or regions of varying distance between the
fixed ligand structures to identify the best nucleic
acid antibodies.

Screens, selections or assays to assess the
effect of binding of a nucleic acid ligand on the
function of the target molecule can be readily combined
with the SELEX methods. Specifically, screens for
inhibition or activation of enzyme activity can be
combined with the SELEX methods.

In more specific embodiments, the SELEX method
provides a rapid means for isolating and identifying
nucleic acid ligands which bind to proteins, including
both nucleic acid-binding proteins and proteins not
known to bind nucleic acids as part of their biological
function. Nucleic acid-binding proteins include among
many others polymerases and reverse transcriptases.
The methods can also be readily applied to proteins
which bind nucleotides, nucleosides, nucleotide
co-factors and structurally related molecules.

In another aspect, the present invention
provides a method for detecting the presence or absence
of, and/or measuring the amount of a target molecule in
a sample, which method employs a nucleic acid ligand
which can be isolated by the methods described herein.
Detection of the target molecule is mediated by its
binding to a nucleic acid ligand specific for that
target molecule. The nucleic acid ligand can be
labeled, for example radiolabeled, to allow qualitative
or quantitative detection. The detection method is
particularly useful for target molecules which are
proteins. The method is more particularly useful for
detection of proteins which are not known to bind
nucleic acids as part of their biological function.
Thus, nucleic acid ligands of the present invention can
be employed in diagnostics in a manner similar to
conventional antibody-based diagnostics. One advantage
of nucleic acid ligands over conventional antibodies in
such detection method and diagnostics is that nucleic
acids are capable of being readily amplified in vitro,
for example, by use of PCR amplification or related
methods. Another advantage is that the entire SELEX
process is carried out in vitro and does not require
immunizing test animals. Furthermore, the binding
affinity of nucleic acid ligands can be tailored to the
user's needs.

Nucleic acid ligands of small molecule targets 
are useful as diagnostic assay reagents and have 
therapeutic uses as sequestering agents, drug delivery 
vehicles and modifiers of hormone action. Catalytic 
nucleic acids are selectable products of this 
invention. For example, by selecting for binding to 
transition state analogs of an enzyme catalyzed 
reaction, catalytic nucleic acids can be selected. 

In yet another aspect, the present invention
provides a method for modifying the function of a
target molecule using nucleic acid ligands which can be
isolated by SELEX. Nucleic acid ligands which bind to
a target molecule are screened to select those which
specifically modify function of the target molecule,
for example to select inhibitors or activators of the
function of the target molecule. An amount of the
selected nucleic acid ligand which is effective for
modifying the function of the target is combined with
the target molecule to achieve the desired functional
modification. This method is particularly applicable
to target molecules which are proteins. A particularly
useful application of this method is to inhibit protein
function, for example to inhibit receptor binding to an
effector or to inhibit enzyme catalysis. In this case,
an amount of the selected nucleic acid molecule which
is effective for target protein inhibition is combined
with the target protein to achieve the desired
inhibition.


BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 is a diagram of the ribonucleotide 
sequence of a portion of the gene 43 messenger RNA 
which encodes the bacteriophage T4 DNA polymerase. 

Shown is the sequence in the region known to bind to
gp43. The bold-faced capitalized letters indicate the
extent of the information required for binding of gp43. 
The eight base-pair loop was replaced by randomized 
sequence to yield a candidate population for SELEX. 

Figure 2 is a schematic diagram of the SELEX
process as exemplified for selecting loop sequence
variants for RNAs that bind to T4 DNA polymerase
(gp43). A DNA template for preparation of a test
mixture of RNAs was prepared as indicated in step a by
ligation of oligomers 3, 4 and 5, whose sequences are
given in Table 1 infra. Proper ligation in step a was
assured by hybridization with oligomers 1 and 2, which
have complementary sequence (given in Table 1) that
bridges oligomers 3 and 4, and 4 and 5, respectively.
The resultant 110-base long template was gel-purified,
annealed to oligo 1 and was used in in vitro transcription
reactions (Milligan et al. (1987) Nucl. Acids Res.
15:8783-8798) to produce an initial RNA mixture
containing randomized sequences of the 8-base loop,
step b. The resultant transcripts were gel-purified
and subjected to selection on nitrocellulose filters
for binding to gp43 (step c), as described in Example
1. Selected RNAs were amplified in a three step
process: (d) cDNA copies of the selected RNAs were
made by reverse transcriptase synthesis using oligo 5
(Table 1) as a primer; (e) cDNAs were amplified using
Taq DNA polymerase chain extension of oligo 1 (Table
1), which carries essential T7 promoter sequences, and
oligo 5 (Table 1) as described in Innis et al. (1988)
Proc. Natl. Acad. Sci. USA 85:9436; and (f) double-stranded
DNA products of amplification were transcribed
in vitro. The resultant selected amplified RNAs were
used in the next round of selection.

Figure 3 is a composite of autoradiographs of
electrophoresed batch sequencing reactions of the in
vitro transcripts derived from SELEX for binding of RNA
loop variants to gp43. The figure indicates the change
in loop sequence components as a function of number of
selection cycles (for 2, 3 and 4 cycles) for selection
conditions of experiment B, in which the concentration
of gp43 was 3 x 10^[exponent illegible] M and the
concentration of RNA was about 3 x 10^[exponent illegible] M
in all selection cycles. Sequencing
was performed as described in Gauss et al. (1987) Mol.
Gen. Genet. 24-34.

Figure 4 is a composite of autoradiographs of
batch RNA sequences of those RNAs selected from the
fourth round of SELEX amplification for binding of RNA
loop variants to gp43 employing different binding
conditions. In experiment A, gp43 concentration was
3 x 10^[exponent illegible] M and RNA concentration was
about 3 x 10^[exponent illegible] M. In experiment B, gp43
was 3 x 10^[exponent illegible] M and RNA was about
3 x 10^[exponent illegible] M. In experiment C, gp43 was
3 x 10^[exponent illegible] M and RNA was about
3 x 10^[exponent illegible] M.

Figure 5 is a composite of autoradiographs of
three sequencing gels for loop variants selected for
binding to gp43 under the selection conditions of
experiment B (see Example 1). The left hand sequence
gel is the batch sequencing of selected RNAs after the
fourth round of selection/amplification. The middle
and right hand sequence gels are double-stranded DNA
sequencing gels of two clonal isolates derived from the
batch RNAs. The batch of RNA selected is composed of
two major variants, one of which was the wild-type
sequence (middle sequence gel), and a novel sequence
(right hand gel).

Figure 6 is a graph of percent RNA bound to
gp43 as a function of gp43 concentration for different
selected RNA loop sequence variants and for RNA with a
randomized loop sequence. Binding of the wild-type
loop sequence AAUAACUC is indicated as open circles,
solid line; major variant loop sequence AGCAACCU as
"X," dotted line; minor variant loop sequence AAUAACUU
as open squares, solid line; minor variant loop
sequence AAUGACUC as solid circles, dotted line; minor
variant loop sequence AGCGACCU as crosses, dotted line;
and binding of the randomized mixture (NNNNNNNN) of
loop sequences as open circles, dotted line.

Figure 7 is a pictorial summary of results
achieved after four rounds of SELEX to select a novel
gp43 binding RNA from a candidate population randomized
in the eight base-pair loop. SELEX did not yield the
"apparent" consensus expected from the batch sequences
shown in Figure 4, but instead yielded wild type and a
single major variant in about equal proportions and
three single mutants. The frequencies of each species
out of twenty cloned isolates are shown together with
the approximate affinity constants (Kd) for each, as
derived from filter binding assays shown in Figure 6.
Figure 8 is a series of diagrams showing
synthesis of candidate nucleic acid ligands using the
enzymes terminal transferase (TDT) and DNA polymerase
(DNA pol). A 5' primer or primary ligand sequence is
provided with a tail of randomized sequence by
incubating with terminal transferase in the presence of
the four deoxynucleotide triphosphates (dNTPs).
Homopolymer tailing of the randomized segment, using
the same enzyme in the presence of a single
deoxynucleotide triphosphate (e.g. dCTP), provides an
annealing site for a poly-G tailed 3' primer. After
annealing, the double-stranded molecule is completed by
the action of DNA polymerase. The mixture can be
further amplified, if desired, by the polymerase chain
reaction.

Figure 9 is a diagram showing a process using
SELEX to select a large nucleic acid ligand having two
spatially separate binding interactions with a target
protein. The process is termed "walking" since it
includes two stages, the second being an extension of
the first. The upper part of the figure depicts a
target ("protein of interest") with a bound nucleic
acid ligand selected by a first round of SELEX
("evolved primary ligand") bound to the protein at a
primary binding site. Terminal
transferase extends the length of the evolved primary
ligand and generates a new set of randomized sequence
candidates having a conserved region containing the
primary ligand. The lower part of the figure depicts
the result of a second round of SELEX based upon
improved binding that results from the secondary ligand
interaction at the secondary binding site of the
protein. The terms "primary" and "secondary" are
merely operative terms that do not imply that one has
higher affinity than the other.

Figures 10 and 11 are diagrams of a selection
process using SELEX in two stages. In Figure 10, SELEX
is applied to select ligands that bind to secondary
binding sites on a target complexed with a bridging
oligonucleotide connected to a specific binder, e.g.,
inhibitor of the target protein. The bridging
oligonucleotide acts as a guide to favor selection of
ligands that bind to accessible secondary binding
sites. In Figure 11, a second SELEX is applied to
evolve ligands that bind at both the secondary sites
originally selected for and the primary target domain.
The nucleic acids thereby evolved will bind very
tightly, and may themselves act as inhibitors of the
target protein or compete against inhibitors or
substrates of the target protein.




Figure 12 shows the sequence and placement of
oligomers used to construct the candidate mixture used
in Example 2. The top line shows the sequences of
oligomers 1b and 2b from left to right, respectively
(see Table 2 infra). The second line shows, from left
to right, the sequences of oligomers 3b, 4b and 5b
(Table 2). Proper ligation of the oligomers was
assured by hybridization with oligomers 1b and 2b,
whose sequences are complementary. The resultant
ligated template was gel-purified, annealed to oligomer
1b and used in an in vitro transcription reaction
(Milligan et al. (1987)) to produce an RNA candidate
mixture, shown in the last line of the figure, labeled
"in vitro transcript." The candidate mixture contained
a 32 nucleotide randomized segment, as shown.

Figure 13 shows a hypothetical RNA sequence
containing a variety of secondary structures that RNAs
are known to adopt. Included are: A hairpin
loops, B bulges, C asymmetric bulges, and D
pseudoknots.

Figure 14 shows nitrocellulose filter binding
assays of ligand affinity for HIV-RT. Shown is the
percent of input RNA that is bound to the
nitrocellulose filter with varying concentrations of
HIV-RT.

Figure 15 shows additional nitrocellulose 
filter binding assays of ligand affinity for HIV-RT. 

Figure 16 shows information boundary
determination for HIV-1 RT ligands 1.1 and 1.3a. a) 3'
boundary determination. RNAs were 5' end labeled,
subjected to partial alkaline hydrolysis and selection
on nitrocellulose filters, separated on a denaturing 8%
polyacrylamide gel and autoradiographed. Approximately
90 picomoles of labeled RNA and 80 picomoles of HIV-1
RT were mixed in 0.5, 2.5, and 5 mls of buffer and
incubated for 5 minutes at 37°C prior to washing
through a nitrocellulose filter. The eluted RNAs are
shown under the final concentrations of HIV-1 RT used
in each experiment. Also shown are the products of a
partial RNase T1 digest which allows identification of
the information boundary on the adjacent sequence as
shown by arrows. b) 5' boundary determination. The 5'
boundary was determined as in a) under the same conditions
listed above.

Figure 17 shows the inhibition of HIV-1 RT by
RNA ligand 1.1. A series of three-fold dilutions of
32N candidate mixture RNA and ligand 1.1 RNA, ranging in
final reaction concentration from 10 micromolar to 4.6
nanomolar, were pre-mixed with HIV-RT and incubated for 5
minutes at 37°C in 6 µl of 200 mM KOAc, 50 mM Tris-HCl,
pH 7.7, 10 mM dithiothreitol, 6 mM Mg(OAc)2, and 0.4
mM NTPs. In a separate tube, RNA template (transcribed
from a PCR product of a T7-1 obtained from U.S.
Biochemical Corp. using oligos 7 and 9) and labeled
oligo 9 were mixed and heated at 95°C for one minute
and cooled on ice for 15 minutes in 10 mM Tris-HCl, pH
7, 0.1 mM EDTA. Four µl of this template was added to
each 6 µl enzyme-inhibitor mixture to start the reaction,
which was incubated for a further 5 minutes at 37°C and
then stopped. The final concentration of HIV-1 RT was
16 nanomolar, of RNA template was 13 nanomolar, and of
labeled primer was 150 nanomolar in all reactions. The
extension products of each reaction are shown.

Figure 18 shows a comparison of HIV-1 RT
inhibition by ligand 1.1 to effects on MMLV RT and AMV
RT. Experiments were performed as in Figure 17 except
that 5-fold dilutions of inhibitor were prepared with
the resultant concentrations as shown. The
concentrations of each RT were normalized to that of
HIV-RT by dilutions and comparison of gel band
intensity with both Coomassie blue and silver stains,
Biorad protein concentration assays, and activity
assays.

Figure 19 shows the consensus sequences of
selected hairpins representing the R-17 coat protein
ligand solution. The nucleotide representation at each
position is indicated in grids. The column headed
"bulge" represents the number of clones with an
extra-helical nucleotide on one or both sides of the
stem between the corresponding stem base-pairs. The
column headed "end" represents the number of clones
whose hairpin terminated at the previous base-pair.

Figure 20 shows a binding curve of 30N bulk RNA
for bradykinin. Analysis was done using spin columns;
10 mM KOAc, 10 mM DEM, pH 7.5; RNA concentration
1.5 x 10^[exponent illegible].

Figure 21 shows templates for use in the
generation of candidate mixtures that are enriched in
certain structural motifs. Template A is designed to
enrich the candidate mixture in hairpin loops.
Template B is designed to enrich the candidate mixture
in pseudoknots.

Figure 22 is a schematic diagram of stem-loop
arrangements for Motifs I and II of the HIV-rev ligand
solution. The dotted lines in stems 1 and 2 between
loops 1 and 3 indicate potential base-pairs.

Figure 23 shows the folded secondary structures
of rev ligand subdomains of isolates 6a, 1a, and 8 to
show motifs I, II and III respectively. Also shown for
comparison is the predicted fold of the wild type RRE
RNA.

Figure 24 is a graph of percent of input counts
bound to a nitrocellulose filter with various
concentrations of HIV rev protein. Also shown are the
binding curves of the 32N starting population (#) and
of the evolved population after 10 rounds (P) and of
the wild type RRE sequence transcribed from a template
composed of oligos 8 and 9 (W).

Figure 25 is a comparison of Motif I(a) rev
ligands. Parameters are as in Figure 24. Also
included is the binding curve of the "consensus"
construct (C).

Figure 26 is a comparison of Motif I(b) rev
ligands. Parameters are as in Figure 24.

Figure 27 is a comparison of Motif II rev
ligands. Parameters are as in Figure 24.

Figure 28 is a comparison of Motif III rev
ligands. Parameters are as in Figure 24.

Figure 29 shows the consensus nucleic acid
ligand solution to HIV rev referred to as Motif I.

Figure 30 shows the consensus nucleic acid
ligand solution to HIV rev referred to as Motif II.

Figure 31 is a schematic representation of a
pseudoknot. The pseudoknot consists of two stems and
three loops, referred to herein as stems S1 and S2 and
loops 1, 2 and 3.



DETAILED DESCRIPTION OF THE INVENTION

The following terms are used herein according 
to the definitions. 

Nucleic acid means either DNA, RNA, single-stranded
or double-stranded and any chemical
modifications thereof, provided only that the
modification does not interfere with amplification of
selected nucleic acids. Such modifications include,
but are not limited to, modifications at cytosine
exocyclic amines, substitution of 5-bromo-uracil,
backbone modifications, methylations, unusual
base-pairing combinations and the like.

Ligand means a nucleic acid that binds another
molecule (target). In a population of candidate
nucleic acids, a ligand is one which binds with greater
affinity than that of the bulk population. In a
candidate mixture there can exist more than one ligand
for a given target. The ligands can differ from one
another in their binding affinities for the target
molecule.

Candidate mixture is a mixture of nucleic acids
of differing sequence, from which to select a desired
ligand. The source of a candidate mixture can be from
naturally-occurring nucleic acids or fragments thereof,
chemically synthesized nucleic acids, enzymically
synthesized nucleic acids or nucleic acids made by a
combination of the foregoing techniques.

Target molecule means any compound of interest
for which a ligand is desired. A target molecule can
be a protein, peptide, carbohydrate, polysaccharide,
glycoprotein, hormone, receptor, antigen, antibody,
virus, substrate, metabolite, transition state analog,
cofactor, inhibitor, drug, dye, nutrient, growth
factor, etc., without limitation.

Partitioning means any process whereby ligands 
bound to target molecules, termed ligand-target pairs 
herein, can be separated from nucleic acids not bound 
to target molecules. Partitioning can be accomplished 
by various methods known in the art. Nucleic acid- 
protein pairs can be bound to nitrocellulose filters 
while unbound nucleic acids are not. Columns which 
specifically retain ligand-target pairs (or 
specifically retain bound ligand complexed to an 
attached target) can be used for partitioning.
Liquid-liquid partition can also be used, as well as filtration,
gel retardation, and density gradient centrifugation.
The choice of partitioning method will depend on 
properties of the target and of the ligand-target pairs 
and can be made according to principles and properties 
known to those of ordinary skill in the art. 

Amplifying means any process or combination of 
process steps that increases the amount or number of 
copies of a molecule or class of molecules. Amplifying 
RNA molecules in the disclosed examples was carried out 
by a sequence of three reactions: making cDNA copies 
of selected RNAs, using polymerase chain reaction to 
increase the copy number of each cDNA, and transcribing 
the cDNA copies to obtain RNA molecules having the same
sequences as the selected RNAs. Any reaction or
combination of reactions known in the art can be used
as appropriate, including direct DNA replication,
direct RNA amplification and the like, as will be
recognized by those skilled in the art. The
amplification method should result in the proportions
of the amplified mixture being essentially
representative of the proportions of different
sequences in the initial mixture.
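
The interplay of partitioning and amplifying defined above can be illustrated numerically. The following is only a minimal sketch and is not the SELEXION analysis referred to in this specification. The function names (`selex_round`, `run_selex`), the two-species pool and the retention fractions are hypothetical and chosen solely for illustration. The sketch assumes that partitioning retains a fixed fraction of each species and that amplification preserves the proportions of the retained species, as the definition of amplifying requires.

```python
def selex_round(pool, retained_fraction):
    """One partition-then-amplify cycle.

    pool: dict mapping a species label to its share of the mixture (sums to 1).
    retained_fraction: dict mapping each species to the fraction of it that
        stays bound to the target during partitioning (hypothetical values).
    """
    # Partitioning: each species is retained in proportion to its binding.
    retained = {seq: share * retained_fraction[seq] for seq, share in pool.items()}
    total = sum(retained.values())
    # Amplification: assumed proportional, so only the relative shares matter.
    return {seq: amount / total for seq, amount in retained.items()}


def run_selex(pool, retained_fraction, rounds):
    """Apply selex_round repeatedly and return the final composition."""
    for _ in range(rounds):
        pool = selex_round(pool, retained_fraction)
    return pool


# Hypothetical starting pool: one tight binder at 0.1% of the mixture.
start = {"high_affinity": 0.001, "bulk": 0.999}
retention = {"high_affinity": 0.5, "bulk": 0.01}
for n in range(5):
    share = run_selex(start, retention, n)["high_affinity"]
    print(n, round(share, 4))
```

Under these assumed numbers the ratio of the tight binder to the bulk grows by a constant factor each round, which is the sense in which enrichment is exponential.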
Specific binding is a term which is defined on
a case-by-case basis. In the context of a given
interaction between a given ligand and a given target,
a binding interaction of ligand and target of higher
affinity than that measured between the target and the
candidate ligand mixture is observed. In order to
compare binding affinities, the conditions of both
binding reactions must be the same, and should be
comparable to the conditions of the intended use. For
the most accurate comparisons, measurements will be
made that reflect the interaction between ligand as a
whole and target as a whole. The nucleic acid ligands
of the invention can be selected to be as specific as
required, either by establishing selection conditions
that demand the requisite specificity during SELEX, or
by tailoring and modifying the ligand through "walking"
and other modifications using interactions of SELEX.

Randomized is a term used to describe a
segment of a nucleic acid having, in principle, any
possible sequence over a given length. Randomized
sequences will be of various lengths, as desired,
ranging from about eight to more than 100 nucleotides.
The chemical or enzymatic reactions by which random
sequence segments are made may not yield mathematically
random sequences due to unknown biases or nucleotide
preferences that may exist. The term "randomized" is
used instead of "random" to reflect the possibility of
such deviations from ideality. In the techniques
presently known, for example sequential chemical
synthesis, large deviations are not known to occur.
For short segments of 20 nucleotides or less, any
minor bias that might exist would have negligible
consequences. The longer the sequences of a single
synthesis, the greater the effect of any bias.
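
To give a sense of scale for the lengths mentioned above, the number of distinct sequences that a fully randomized segment can take is 4 raised to the segment length. The short sketch below computes this for the eight-nucleotide loop of Figure 1, a 20-nucleotide segment and the 32-nucleotide segment of Figure 12; the function name `sequence_space` is hypothetical and used only for illustration.

```python
def sequence_space(length, alphabet_size=4):
    """Number of distinct sequences of the given length over A, C, G, U (or T)."""
    return alphabet_size ** length


# Lengths taken from the text: the 8-base loop, 20 nucleotides, and 32 nucleotides.
for n in (8, 20, 32):
    print(n, sequence_space(n))
```

An eight-nucleotide randomized loop thus corresponds to 65,536 possible sequences, while a 32-nucleotide segment corresponds to roughly 1.8 x 10^19.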

A bias may be deliberately introduced into
randomized sequence, for example, by altering the molar
ratios of precursor nucleoside (or deoxynucleoside)
triphosphates of the synthesis reaction. A deliberate
bias may be desired, for example, to approximate the
proportions of individual bases in a given organism, or
to affect secondary structure.
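
The effect of a deliberate bias can be sketched in silico. The example below is an illustration only and is not a model of any synthesis chemistry: it draws each position of a randomized segment independently, with per-base weights standing in for the molar ratios of the precursors. The function name `randomized_segment` and the weights shown are hypothetical.

```python
import random


def randomized_segment(length, weights=None, rng=None):
    """Draw one randomized RNA segment.

    weights: optional dict giving a relative weight for each of A, C, G, U;
        equal weights (no deliberate bias) are assumed when omitted.
    rng: optional random.Random instance, so that draws can be reproduced.
    """
    rng = rng or random.Random()
    bases = "ACGU"
    if weights is None:
        weights = {base: 1.0 for base in bases}
    # Each position is sampled independently according to the weights.
    picks = rng.choices(bases, weights=[weights[b] for b in bases], k=length)
    return "".join(picks)


# Unbiased 32-nucleotide segment, and one biased toward G and C (assumed ratios).
print(randomized_segment(32, rng=random.Random(0)))
print(randomized_segment(32, {"A": 1, "C": 3, "G": 3, "U": 1}, random.Random(0)))
```

Raising the weights of G and C in this way is one hypothetical means of approximating the base composition of a GC-rich organism.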

SELEXION refers to a mathematical analysis and
computer simulation used to demonstrate the powerful
ability of SELEX to identify nucleic acid ligands and
to predict which variations in the SELEX process have
the greatest impact on the optimization of the process.
SELEXION is an acronym for Systematic Evolution of
Ligands by Exponential enrichment with Integrated
Optimization by Nonlinear analysis.

Nucleic acid antibodies is a term used to refer
to a class of nucleic acid ligands that are comprised
of discrete nucleic acid structures or motifs that
selectively bind to target molecules. Nucleic acid
antibodies may be made up of double or single stranded
RNA or DNA. The nucleic acid antibodies are
synthesized, and in a preferred embodiment are
constructed based on a ligand solution or solutions
received for a given target by the SELEX process. In
many cases, the nucleic acid antibodies of the present
invention are not naturally occurring, while
in other situations they may have significant
similarity to a naturally occurring nucleic acid
sequence.

The nucleic acid antibodies of the present
invention include all nucleic acids having a specific
binding affinity for a target, while not including the
cases when the target is a polynucleotide which binds
to the nucleic acid through a mechanism which
predominantly depends on Watson/Crick base pairing or
triple helix agents (See, Riordan, M. et al. (1991)
Nature 350:442-443), provided, however, that when the
nucleic acid antibody is double-stranded DNA, the
target is not a naturally occurring protein whose
physiological function depends on specific binding to
double-stranded DNA.

RNA motifs is a term generally used to describe the
secondary or tertiary structure of RNA molecules.
The primary sequence of an RNA is a specific string of
nucleotides (A, C, G or U) in one dimension. The
primary sequence does not give information on first
[...]. In certain cases, the ligand solution obtained after
performing SELEX on a given target may best be
represented as a primary sequence. Although
conformational information pertaining to such a ligand
solution is not always ascertainable based on the
results obtained by SELEX, the representation of a
ligand solution as a primary sequence shall not be
interpreted as disclaiming the existence of an integral
tertiary structure.

The secondary structure of an RNA motif is
represented by contacts in two dimensions between
specific nucleotides. The most easily recognized
secondary structure motifs are comprised of the
Watson/Crick basepairs A:U and C:G. Non-Watson/Crick
basepairs, often of lower stability, have been
[...] U:U. (Base pairs are shown once; in RNA molecules the
base pair X:Y by convention represents a sequence in
which X is 5' to Y, whereas the base pair Y:X is also
allowed.) In Figure 13 are shown a set of secondary
structures, linked by single-stranded regions; the
conventional nomenclature for the secondary structures
includes hairpin loops, asymmetric bulged hairpin
loops, symmetric hairpin loops, and pseudoknots.

When nucleotides that are distant in the
primary sequence and not thought to interact through
Watson/Crick and non-Watson/Crick base pairs are in
fact interacting, these interactions (which are often
depicted in two dimensions) are also part of the
secondary structure.

The three dimensional structure of an RNA motif
is merely the description, in space, of the atoms of
the RNA motif. Double-stranded RNA, fully base paired
through Watson/Crick pairing, has a regular structure
in three dimensions, although the exact positions of
all the atoms of the helical backbone could depend on
the exact sequence of bases in the RNA. A vast
literature is concerned with secondary structures of
RNA motifs, and those secondary structures containing
Watson/Crick base pairs are thought often to form
A-form double stranded helices.

From A-form helices one can extend toward the 
other motifs in three dimensions. Non-Watson/Crick 
base pairs, hairpin loops, bulges, and pseudoknots are 
structures built within and upon helices. The 
construction of these additional motifs is described 
more fully in the text. 

The actual structure of an RNA includes all the 
atoms of the nucleotide of the molecule in three 
dimensions. A fully solved structure would include as 
well bound water and inorganic atoms, although such 
resolution is rarely achieved by a researcher. Solved 
RNA structures in three dimensions will include all the 
secondary structure elements (represented as three 
dimensional structures) and fixed positions for the
atoms of nucleotides not restrained by secondary
structure elements; due to base stacking and other 
forces extensive single stranded domains may have fixed 
structures. 

Primary sequences of RNAs limit the possible
three dimensional structures, as do the fixed secondary
structures. The three dimensional structures of an RNA
are limited by the specified contacts between atoms in
two dimensions, and are then further limited by energy
minimizations, the capacity of a molecule to rotate all
freely rotatable bonds such that the resultant molecule
is more stable than other conformers having the same
primary and secondary sequence and structure.

Most importantly, RNA molecules have structures
in three dimensions that are comprised of a collection
of RNA motifs, including any number of the motifs shown
in Figure 13.

Therefore, RNA motifs include all the ways in
which it is possible to describe in general terms the
most stable groups of conformations that a nucleic acid
compound can form. For a given target, the ligand
solution and the nucleic acid antibody may be one of
the RNA motifs described herein or some combination of
several RNA motifs.

Ligand solutions are defined as the three
dimensional structure held in common or as a family
that defines the conserved components identified through
SELEX. For example, the ligands identified for a
particular target may contain a primary sequence in
common (NNNCGNAANUCGN'N'N') which can be represented by
a hairpin in two dimensions by:

     AAN
    N   U
    G   C
    C   G
    N   N'
    N   N'
    N   N'

The three dimensional structure would thus be
insensitive to the exact sequence of three of the five
base pairs and two of the five loop nucleotides, and
would in all or most versions of the sequence/structure
be an appropriate ligand for further use. Thus ligand
solutions are meant to represent a potentially large
collection of appropriate sequence/structures, each
identified by the family description which is inclusive
of all exact sequence/structure solutions. It is
further contemplated through this definition that
ligand solutions need not include only members with
exact numerical equivalency between the various
components of an RNA motif. Some ligands may have
loops, for example, of five nucleotides while other
ligands for the same target may contain fewer or more
nucleotides in the equivalent loop and yet be included
in the description of the ligand solution.
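
Membership in a family of this kind can be tested mechanically. The sketch below is illustrative only: it checks whether a 15-nucleotide RNA fits the example hairpin family of this definition, in which three arbitrary nucleotides are followed by C, G, a five-nucleotide loop of the form N-A-A-N-U, then C, G, and three nucleotides that pair with the first three. It assumes that the primed positions must form Watson/Crick pairs with the corresponding unprimed positions, and it does not handle the variable loop lengths contemplated above. The names `FAMILY`, `PAIR` and `in_family` are hypothetical.

```python
import re

# Watson/Crick partners only; non-Watson/Crick pairs are ignored (assumption).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# 5' arm (3 nt), C, G, loop N-A-A-N-U, C, G, 3' arm (3 nt).
FAMILY = re.compile(r"^([ACGU]{3})CG[ACGU]AA[ACGU]UCG([ACGU]{3})$")


def in_family(seq):
    """Return True if seq fits the example hairpin family."""
    match = FAMILY.match(seq)
    if match is None:
        return False
    five_prime_arm, three_prime_arm = match.group(1), match.group(2)
    # The 3' arm must be the reverse complement of the 5' arm.
    return all(
        PAIR[a] == b for a, b in zip(five_prime_arm, reversed(three_prime_arm))
    )


print(in_family("GGACGUAAGUCGUCC"))
print(in_family("GGACGUAAGUCGUCA"))
```

The first sequence satisfies both the fixed positions and the pairing of the arms; the second differs only in its final nucleotide, which breaks the outermost base pair.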

Although the ligand solution derived by SELEX
may include a relatively large number of potential
members, the ligand solutions are target specific and,
for the most part, each member of the ligand solution
family can be used as a nucleic acid antibody to the
target. The selection of a specific member from a
family of ligand solutions to be employed as a nucleic
acid antibody can be made as described in the text and
may be influenced by a number of practical
considerations that would be obvious to one of ordinary
skill in the art.

The method of the present invention was developed
in connection with investigations of translational
regulation in bacteriophage T4 infection.
Autoregulation of the synthesis of certain viral
proteins, such as the bacteriophage T4 DNA polymerase
(gp43), involves binding of the protein to its own
message, blocking its translation. The SELEX method
was used to elucidate the sequence and structure
requirements of the gp43 RNA binding site. SELEX
allowed the rapid selection of preferred binding
sequences from a population of random nucleic acid
sequences. Although exemplified by the isolation and
identification of nucleic acid sequences which bind to
proteins known to bind to RNA, the method of the
present invention is generally applicable to the
selection of a nucleic acid capable of binding any
given protein. The method is applicable to selection
of nucleic acids which bind to proteins which are not
known to bind nucleic acid as a part of their
biological function. The
SELEX method requires no knowledge of the structure or
sequence of a binding site and no knowledge of the
structure or sequence of the target protein.

The method does not depend on purified target protein for
selections. In general, application of SELEX will
enrich for ligands of the most abundant target. In a
mixture of ligands, techniques for isolating the ligand
[...] (e.g., substrate, inhibitor, antibody) of the
desired target can be used to compete specifically for
binding the target, so that the desired nucleic acid
ligand can be partitioned from ligands of other
targets.

[...] SELEX are comprised of single stranded RNA sequences.
[...] the present inventors were able to make conclusions
about RNA that are contrary to those commonly held in the
field, and to use these conclusions to tailor the SELEX
process to achieve nucleic acid antibodies derived from
ligand solutions.

RNA was first appreciated as an information
messenger between the DNA sequences that are the genes
and the protein sequences that are found within enzymes
and other proteins. From the first moments after
Watson and Crick described the structure of DNA and the
connection between DNA sequence and protein sequence,




the means by which proteins were synthesized became
central to much experimental biochemistry. Eventually
messenger RNA (mRNA) was identified as the chemical
intermediate between genes and proteins. A majority of
RNA species present in organisms are mRNAs, and thus
RNA continues to be seen largely as an informational
molecule. RNA serves its role as an informational
molecule largely through the primary sequence of
nucleotides, in the same way that DNA serves its
function as the material of genes through the primary
sequence of nucleotides; that is, information in
nucleic acids can be represented in one dimension.

As the biochemistry of gene expression was
studied, several RNA molecules within cells were
discovered whose roles were not informational.
Ribosomes were discovered to be the entities upon which
mRNAs are translated into proteins, and ribosomes were
discovered to contain essential RNA (ribosomal RNAs, or
rRNAs). rRNAs for many years were considered to be
structural, a sort of scaffold upon which the protein
components of the ribosome were "hung" so as to allow
the protein components of the ribosome to perform the
protein synthetic action of the ribosome. An
additional large class of RNAs, the transfer RNAs
(tRNAs), were postulated and found. tRNAs are the
chemically bifunctional adapters that recognize codons
within mRNA and carry the amino acids that are
condensed into protein. Most importantly, even though
a tRNA structure was determined by X-ray analysis in
1974, RNAs were considered to be primarily "strings" in
one dimension for an additional decade. rRNA occupied
a strange position in the research community. For a
long period almost no one sensed the reason behind the
deep similarities in rRNAs from various species, and
the true chemical capacity of RNA molecules. Several
researchers postulated that RNA might once have served
an enzymatic rather than informational role, but these




postulates were never intended to be predictive about
present functions of RNA.

Tom Cech's work on ribozymes - a new class of
RNA molecules - expanded the view of the functional
capacity of RNA. The group I introns are able to
splice autocatalytically, and thus at least some
limited catalysis is within the range of RNA. Within
this range of catalysis is the activity of the RNA
component of RNase P, an activity discovered by Altman
and Pace. Cech and Altman received the Nobel Prize in
Chemistry for their work, which fundamentally changed
the previous limitations for RNA molecules to
informational roles. rRNAs, because of the work of
Cech and Altman, are now thought by some to be the
catalytic center of the ribosome, and are no longer
thought to be merely structural.

It is a central premise of this Invention that
RNA molecules remain underestimated by the research
community, with respect to binding and other
capacities. While ribozymes have caused a remarkable
increase in research aimed at RNA functions, the
present application contemplates that the shape
possibilities for RNA molecules (and probably DNA as
well) afford an opportunity to use SELEX to find RNAs
with virtually any binding function. It is further
contemplated that the range of catalytic functions
possible for RNA is broad beyond the present
conventional wisdom, although not necessarily as broad
as that of proteins.

The three dimensional shapes of some RNAs are
known directly from either X-ray diffraction or NMR
methodologies. The existing data set is sparse. The
structures of four tRNAs have been solved, as well as
three smaller RNA molecules: two small hairpins and a
small pseudoknot. The various tRNAs, while related,
have elements of unique structure; for example, the
anticodon bases of the elongator tRNAs are displayed






toward the solvent, while the anticodon bases of an 
initiator tRNA are pointed more away from the solvent. 
Some of these differences may result from crystal 
lattice packing forces, but some are also no doubt a 
result of idiosyncratic energy minimization by 
different single stranded sequences within homologous 
secondary and three dimensional structures. 

Sequence variations of course are vast. If a 
single stranded loop of an RNA hairpin contains eight 
nucleotides, 65,536 different sequences comprise the 
saturated sequence "space." Although not bound to the
theory of this assertion, the inventors of this 
invention believe that each member of that set will 
have, through energy minimization, a most stable 
structure, and the bulk of those structures will 
present subtly distinct chemical surfaces to the 
solvent or to potential interacting target molecules 
such as proteins. Thus, when all 65,536 sequences 
within a particular structural motif were tested 
against the bacteriophage T4 DNA polymerase, two 
sequences from that set bound better than all others. 
This suggests that structural aspects of those two 
sequences are special for that target, and that the 
remaining 65,534 sequences are not as well suited for 
binding to the target. It is almost certain that 
within those 65,536 sequences are other individual 
members or sets that would be best suited for 
interacting with other targets. 
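
By way of illustration only, the figure of 65,536 follows from four
possible nucleotides at each of eight loop positions (4 to the 8th
power). The short Python sketch below enumerates that space; the
function name sequence_space is an illustrative assumption and is
not part of the original disclosure.

```python
from itertools import product

NUCLEOTIDES = "ACGU"  # the four RNA bases


def sequence_space(length):
    """Yield every RNA sequence of the given length (illustrative helper)."""
    for bases in product(NUCLEOTIDES, repeat=length):
        yield "".join(bases)


loop_length = 8
count = sum(1 for _ in sequence_space(loop_length))

# Explicit enumeration agrees with the closed form 4 ** length.
print(count, 4 ** loop_length)
```

Enumeration is practical only for short loops; for longer randomized
regions the closed form 4 ** length is the sensible way to size the
space.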

A key concept in this description of RNA
structures is that every sequence will find its most
stable structure, even though RNAs are often drawn so
as to suggest a random coil or floppy, unstructured
element. Homopolymers of RNA, unable to form
Watson/Crick base pairs, are often found to have a
non-random structure attributed to stacking energy gained
by fixing the positions of adjacent bases over each
other. Clearly sequences involving all four






nucleotides may have local regions of fixed structure,
and even without Watson/Crick base pairs a non-uniform
sequence may have more structure than is at first
presumed. The case for fixed structures in RNA loops
is even stronger. The anticodon loops of tRNAs have a
known structure, and so do, presumably, the two winning
sequences that bind best to T4 DNA polymerase.

Antiparallel strands of complementary sequence
in RNA yield A-form helices, from which loop sequences
emerge and return. Even if the loop sequences do not
have a strong capacity to interact, energy minimization
is an energetically free structure optimization (that
is, no obvious energies of activation block energy
minimization of a loop sequence). A kinetically likely
starting point for optimization may be the loop closing
base pair of an RNA stem, which presents a flat surface
upon which optimal stacking of loop nucleotides and
bases may occur. Loops of RNA are in principle
equivalent to loops of protein connecting antiparallel
alpha-helices or beta-strands. Although these protein
loops are often called random coils, they are neither
random nor coiled. Such loops are called "omega"
structures, reflecting that the loop emerges and
returns to positions that are relatively close to each
other (See, Leszczynski, J. and Rose, G. et al. (1986)
Science 234:849-855); those positions in a protein are
conceptually equivalent to the loop closing base pair
of an RNA hairpin.

Many omega structures have been solved by X-ray
diffraction, and the structures are idiosyncratic.
Clearly each structure is the result of a unique energy
minimization acted upon a loop whose ends are close to
each other. Both in proteins and RNAs those loops will
energy minimize without information from the rest of
the structure except, to a first approximation, the
loop closing pair of amino acids or base pair. For




both protein omega loops and RNA hairpin loops all the
freely rotatable bonds will participate in the attempt
to minimize the free energy. RNA, it seems, will be
rather more responsive to electrostatics than proteins,
while proteins will have many more degrees of freedom
than RNAs. Thus, calculations of RNA structures
through energy minimization are more likely to yield
accurate solution structures than are comparable
calculations for proteins.

Single stranded regions of both RNAs and
protein may be held so as to extend the possible
structure. That is, if a single stranded loop emerges
and returns in a protein structure from parallel
strands of alpha-helix or beta-strands, the points of
emergence and return are further from each other than
in the omega structures. Furthermore, the distance
spanned by the single strand of peptide can be varied
by the lengths of parallel alpha-helix or beta-strand.

For those protein structures in which the 
single strand lies upon a fixed protein secondary 
structure, the resultant energy minimization could, in 
principle, allow interactions between the single 
stranded domain and the underlying structure. It is 
likely that amino acid side chains that can form salt 
bridges in secondary structures could do the same in 
extended single strands lying on top of regular 
secondary structures. Thus the exact structures of 
such protein regions will again be idiosyncratic, and 
very much sequence dependent. In this case the 
sequence dependence will include both the single strand 
and the underlying sequence of the secondary structure. 

Interestingly, an RNA structure known as a 
pseudoknot is analogous to these extended protein 
motifs, and may serve to display toward solvent or 
target molecules extended single strands of RNA whose 
bases are idiosyncratically arrayed toward either the 
solvent/target or an underlying RNA secondary 






structure. Pseudoknots have, in common with protein
motifs based on loops between parallel strands, the
capacity to alter the length of single strand and the
sequence of the helix upon which it lies.

Thus, exactly like in protein motifs, by
covariation with sequences in the underlying secondary
structure it is possible to display single stranded
nucleotides and bases toward either the solvent or the
underlying structure, thus altering the electrostatics
and the functional chemical groups that are interacting
with targets. It is important to note that such
structure variations follow from energy minimizations,
but only one pseudoknot structure is known, even at low
resolution. Nevertheless, the value of this Invention
arises out of the recognition that the shape and
functional displays possible from pseudoknots are
recognized to be nearly infinite in unique qualities.

Both hairpin loops and the single stranded
domain of pseudoknots are built upon antiparallel RNA
helices. Helices of RNA may contain irregularities
called bulges. Bulges can exist in one strand of a
helix or both, and will provide idiosyncratic
structural features useful for target recognition.
Additionally, helix irregularities can provide angled
connections between regular helices.

A large bulge (see Figure 13) on one strand of
RNA may be comparable to hairpin loops, except that the
loop closing base pair is replaced by the two base
pairs flanking the bulge.

Asymmetric bulges (see Figure 13) may provide
an elongated and irregular structure that is stabilized
by nucleotide contacts across the bulge. These
contacts may involve Watson/Crick interactions or any
other stabilizing arrangement, including other hydrogen
bonds and base stacking.

Finally, when contemplating fixed RNA shapes or
motifs, it is instructive to consider what substantial




differences exist between RNA and proteins. Since
protein is thought to have displaced RNA during
evolution for those activities now carried out almost
entirely by proteins and peptides, including catalysis
and highly specific recognition, the chemical
properties of proteins are thought to be more useful
than RNA for constructing variable shapes and
activities. The standard reasoning includes the
existence of 20 amino acids versus only four
nucleotides, the strong ionic qualities of lysine,
arginine, aspartic acid, and glutamic acid which have
no counterpart in the RNA bases, the relative
neutrality of the peptide backbone when compared to the
strongly negative sugar-phosphate backbone of nucleic
acids, the existence of histidine with a pK near
neutrality, the fact that the side chains of the amino
acids point toward the solvent in both alpha helices
and beta strands, and the regular secondary structures
of proteins. In the double stranded nucleic acids,
including RNA, base pairs point the bases toward each
other and utilize much of the chemical information
present at the one dimensional level. Thus, from every
angle presently understood to contribute to shape
diversity and function, proteins are thought to be the
vastly superior chemical to nucleic acids, including
RNA. During evolution, proteins were chosen for
recognition and catalysis over RNA, thus supporting the
present widely held view.

Conversely, and central to this Invention, the
vast number of sequences and shapes possible for RNA
will conceivably allow, especially with sequences never
tested during evolutionary history, every desired
function and binding affinity even though RNA is made
up of only four nucleotides and even though the
backbone of an RNA is so highly charged. That is, the
RNA motifs described above, with appropriate sequence
specifications, will yield in space those chemical




functions needed to provide tight and specific binding
to most targets. It may be suggested that RNA is as
versatile as the immune system. That is, while the
immune system provides a fit to any desired target, RNA
provides those same opportunities. The enabling
methodology described herein can utilize 10^18
sequences, and thus try vast numbers of structures such
that whatever intrinsic advantages proteins or
specifically antibodies may have over RNA are
compensated for by the vastness of the possible "pool"
from which RNA ligands are selected. In addition, with
the use of modified nucleotides, RNA can be used that
is intrinsically more chemically varied than natural
RNAs.

The SELEX method involves the combination of a
selection of nucleic acid ligands which bind to a
target molecule, for example a protein, with
amplification of those selected nucleic acids.
Iterative cycling of the selection/amplification steps
allows selection of one or a small number of nucleic
acids which bind most strongly to the target from a
pool which contains a very large number of nucleic
acids.
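
The effect of iterative cycling can be illustrated with a toy
calculation. The Python sketch below is a simplified model under
stated assumptions and is not the mathematical analysis used by the
inventors: each species is given a hypothetical probability of being
retained at the partitioning step, amplification is assumed to
preserve proportions, and the pool is renormalized after each round.
All names and numerical values are invented for illustration.

```python
def selex_round(freqs, retention):
    """One idealized selection/amplification round.

    freqs:     dict mapping species -> fraction of the pool
    retention: dict mapping species -> probability of surviving
               partitioning (hypothetical illustrative values)
    """
    selected = {s: f * retention[s] for s, f in freqs.items()}
    total = sum(selected.values())
    # Faithful amplification only rescales the pool back to a total of 1.
    return {s: v / total for s, v in selected.items()}


# Hypothetical starting pool: one rare tight binder among bulk sequences.
pool = {"tight_binder": 1e-6, "bulk": 1.0 - 1e-6}
retention = {"tight_binder": 0.5, "bulk": 0.05}

rounds = 0
# Cycle until the tight binder makes up at least 90% of the pool.
while pool["tight_binder"] < 0.9:
    pool = selex_round(pool, retention)
    rounds += 1

print(rounds, pool["tight_binder"])
```

In this model the ratio of tight binder to bulk grows by the ratio
of the two retention probabilities in every round, which is why a
species present at one part per million can dominate the pool after
a small number of cycles.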

Cycling of the selection/amplification
procedure is continued until a selected goal is
achieved. For example, cycling can be continued until
a desired level of binding of the nucleic acids in the
test mixture is achieved or until a minimum number of
nucleic acid components of the mixture is obtained (in
the ultimate case until a single species remains in the
test mixture). In many cases, it will be desired to
continue cycling until no further improvement of
binding is achieved. It may be the case that certain
test mixtures of nucleic acids show limited improvement
in binding over background levels during cycling of the
selection/amplification. In such cases, the sequence
variation in the test mixture should be increased




by including more of the possible sequence variants, or the
length of the sequence randomized region should be
increased, until improvements in binding are achieved.
Anchoring protocols and/or walking techniques can be
employed as well.

Specifically, the method requires the initial
preparation of a test mixture of candidate nucleic
acids. The individual test nucleic acids can contain a
randomized region flanked by sequences conserved in all
nucleic acids in the mixture. The conserved regions
are provided to facilitate amplification of selected
nucleic acids. Since there are many such sequences
known in the art, the choice of sequence is one which
those of ordinary skill in the art can make, having in
mind the desired method of amplification. The
randomized region can have a fully or partially
randomized sequence. Alternatively, this portion of
the nucleic acid can contain subportions that are
randomized, along with subportions which are held
constant in all nucleic acid species in the mixture.
For example, sequence regions known to bind, or
selected for binding, to the target protein can be
integrated with randomized regions to achieve improved
binding or improved specificity of binding. Sequence
variability in the test mixture can also be introduced
or augmented by generating mutations in the nucleic
acids in the test mixture during the
selection/amplification process. In principle, the
nucleic acids employed in the test mixture can be any
length as long as they can be amplified. The method of
the present invention is most practically employed for
selection from a large number of sequence variants.
Thus, it is contemplated that the present method will
preferably be employed to assess binding of nucleic
acid sequences ranging in length from about four bases
to any attainable size.
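
The layout of a test nucleic acid described above (a randomized
region flanked by conserved sequences) can be sketched in a few
lines of Python. This is an illustration only: the flanking
sequences FIXED_5P and FIXED_3P are arbitrary placeholders invented
for the example, not sequences from this application, and a
pseudorandom generator merely stands in for chemical synthesis with
mixed nucleotides.

```python
import random

BASES = "ACGT"  # DNA template alphabet; transcription would yield RNA

# Placeholder flanks, hypothetical and for illustration only. Real
# flanks would be chosen to suit the intended amplification method.
FIXED_5P = "GATTACAGATTACA"
FIXED_3P = "TGTAATCTGTAATC"


def make_candidate(rng, random_length):
    """Return one candidate: 5' flank + randomized region + 3' flank."""
    randomized = "".join(rng.choice(BASES) for _ in range(random_length))
    return FIXED_5P + randomized + FIXED_3P


rng = random.Random(0)  # seeded so that the example is repeatable
library = [make_candidate(rng, 30) for _ in range(5)]

for candidate in library:
    print(candidate)
```

Because every member shares the same flanks, a single pair of
primers complementary to those flanks would suffice to amplify
whichever members survive selection.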

The randomized portion of the nucleic acids in 






the test mixture can be derived in a number of ways.
For example, full or partial sequence randomization can
be readily achieved by direct chemical synthesis of the
nucleic acid (or portions thereof) or by synthesis of a
template from which the nucleic acid (or portions
thereof) can be prepared by use of appropriate enzymes.
End addition, catalyzed by terminal transferase in the
presence of nonlimiting concentrations of all four
nucleotide triphosphates can add a randomized sequence
to a segment. Sequence variability in the test nucleic
acids can also be achieved by employing size-selected
fragments of partially digested (or otherwise cleaved)
preparations of large, natural nucleic acids, such as
genomic DNA preparations or cellular RNA preparations.
In those cases in which randomized sequence is
employed, it is not necessary (or possible from long
randomized segments) that the test mixture contains all
possible variant sequences. It will generally be
preferred that the test mixture contain as large a
number of possible sequence variants as is practical
for selection, to ensure that a maximum number of
potential binding sequences are identified. A
randomized sequence of 30 nucleotides will contain a
calculated 10^18 different candidate sequences. As a
practical matter, it is convenient to sample only about
10^18 candidates in a single selection. Practical
considerations include the number of templates on the
DNA synthesis column, and the solubility of RNA and the
target in solution. (Of course, there is no
theoretical limit for the number of sequences in the
candidate mixture.) Therefore, candidate mixtures that
have randomized segments longer than 30 contain too
many possible sequences for all to be conveniently
sampled in one selection. It is not necessary to
sample all possible sequences of a candidate mixture to
select a nucleic acid ligand of the invention. It is
basic to the method that the nucleic acids of the test




mixture are capable of being amplified. Thus, it is 
preferred that any conserved regions employed in the 
test nucleic acids do not contain sequences which 
interfere with amplification. 
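
The relationship between the length of the randomized region and
the fraction of sequence space that can be sampled can be checked
with simple arithmetic. The Python sketch below takes the figure of
about 10^18 candidates per selection from the text above; the
function names and the chosen lengths are assumptions made for
illustration, and the coverage value is only an upper bound because
it assumes that every sampled molecule is distinct.

```python
SAMPLED = 10 ** 18  # approximate candidates per selection, per the text


def space_size(length):
    """Number of distinct sequences with four bases at each position."""
    return 4 ** length


def coverage_upper_bound(length, sampled=SAMPLED):
    """Largest possible fraction of the sequence space that is sampled."""
    return min(1.0, sampled / space_size(length))


for length in (20, 30, 40):
    print(length, space_size(length), coverage_upper_bound(length))
```

A 30 nucleotide region gives 4 ** 30, roughly 1.15 x 10^18
sequences, so a sample of about 10^18 molecules is of the same
order as the whole space, while at 40 nucleotides the same sample
covers less than one millionth of it.
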

The various RNA motifs described above can
almost always be defined by a polynucleotide containing
about 30 nucleotides. Because of the physical
constraints of the SELEX process, a randomized mixture
containing about 30 nucleotides is also about the
longest contiguous randomized segment which can be
utilized while being able to test substantially all of
the potential variants. It is, therefore, a preferred
embodiment of this invention when utilizing a candidate
mixture with a contiguous randomized region, to use a
randomized sequence of at least 15 nucleotides and
containing at least about 10^9 nucleic acids, and in the
most preferred embodiment contains at least 25
nucleotides.

This invention includes candidate mixtures
containing all possible variations of a contiguous
randomized segment of at least 15 nucleotides. Each
individual member in the candidate mixture may also be
comprised of fixed sequences flanking the randomized
segment that aid in the amplification of the selected
nucleic acid sequences.

Candidate mixtures may also be prepared
containing both randomized sequences and fixed
sequences wherein the fixed sequences serve a function
in addition to the amplification process. In one
embodiment of the invention, the fixed sequences in a
candidate mixture may be selected in order to enhance
the percentage of nucleic acids in the candidate
mixture possessing a given nucleic acid motif. For
example, the incorporation of the appropriate fixed
nucleotides will make it possible to increase the
percentage of pseudoknots or hairpin loops in a
candidate mixture. A candidate mixture that has been




prepared including fixed sequences that enhance the
percentage of a given nucleic acid structural motif is,
therefore, a part of this invention. One skilled in
the art, upon routine inspection of a variety of
nucleic acid antibodies as described herein, will be able to
construct, without undue experimentation, such a
candidate mixture. Examples 2 and 8 below describe
specific examples of candidate mixtures engineered to
maximize preferred RNA motifs.

Candidate mixtures containing various fixed 
sequences or using a purposefully partially randomized 
sequence may also be employed after a ligand solution 
or partial ligand solution has been obtained by SELEX. 
A new SELEX process may then be initiated with a 
candidate mixture informed by the ligand solution. 

Polymerase chain reaction (PCR) is an exemplary
method for amplifying nucleic acids. Descriptions
of PCR methods are found, for example, in Saiki et al.
(1985) Science 230:1350-1354; Saiki et al. (1986)
Nature 324:163-166; Scharf et al. (1986) Science
233:1076-1078; Innis et al. (1988) Proc. Natl. Acad.
Sci. 85:9436-9440; and in U.S. Patent 4,683,195 (Mullis
et al.) and U.S. Patent 4,683,202 (Mullis et al.). In
its basic form, PCR amplification involves repeated
cycles of replication of a desired single-stranded DNA
(or cDNA copy of an RNA) employing specific
oligonucleotide primers complementary to the 3' and 5'
ends of the ssDNA, primer extension with a DNA
polymerase, and DNA denaturation. Products generated
by extension from one primer serve as templates for
extension from the other primer. A related
amplification method described in PCT published
application WO 89/01050 (Burg et al.) requires the
presence or introduction of a promoter sequence
upstream of the sequence to be amplified, to give a
double-stranded intermediate. Multiple RNA copies of
the double-stranded promoter containing intermediate




are then produced using RNA polymerase. The resultant
RNA copies are treated with reverse transcriptase to
produce additional double-stranded promoter containing
intermediates which can then be subject to another
round of amplification with RNA polymerase.
Alternative methods of amplification include among
others cloning of selected DNAs or cDNA copies of
selected RNAs into an appropriate vector and
introduction of that vector into a host organism where
the vector and the cloned DNAs are replicated and thus
amplified (Guatelli, J.C. et al. (1990) Proc. Natl.
Acad. Sci. 87:1874). In general, any means that will
allow faithful, efficient amplification of selected
nucleic acid sequences can be employed in the method of
the present invention. It is only necessary that the
proportionate representation of sequences after
amplification at least roughly reflects the relative
proportions of sequences in the mixture before
amplification.

Specific embodiments of the present invention
for amplifying RNAs were based on Innis et al. (1988)
supra. The RNA molecules and target molecules in the
test mixture were designed to provide, after
amplification and PCR, essential T7 promoter sequences
in their 5' portions. Full-length cDNA copies of
selected RNA molecules were made using reverse
transcriptase primed with an oligomer complementary to
the 3' sequences of the selected RNAs. The resultant
cDNAs were amplified by Taq DNA polymerase chain
extension, providing the T7 promoter sequences in the
selected DNAs. Double-stranded products of this
amplification process were then transcribed in vitro.
Transcripts were used in the next
selection/amplification cycle. The method can
optionally include appropriate nucleic acid
purification steps.

In general any protocol which will allow 






selection of nucleic acids based on their ability to
bind specifically to another molecule, i.e., a protein
or in the most general case any target molecule, can be
employed in the method of the present invention. It is
only necessary that the selection partition nucleic
acids which are capable of being amplified. For
example, a filter binding selection, as described in
Example 1, can be used, in which a test nucleic acid
mixture is incubated with target protein, and the
nucleic acid/protein
mixture is then filtered through a nitrocellulose
filter and washed with appropriate buffer to remove
free nucleic acids. Protein/nucleic acid complexes often
remain bound to the filter. The relative concentrations of
protein to test nucleic acid in the incubated mixture
influence the strength of binding that is selected
for. When nucleic acid is in excess, competition for
available binding sites occurs and those nucleic acids
which bind most strongly are selected. Conversely,
when an excess of protein is employed, it is expected
that any nucleic acid that binds to the protein will be
selected. The relative concentrations of protein to
nucleic acid employed to achieve the desired selection
will depend on the type of protein, the strength of the
binding interaction and the level of any background
binding that is present. The relative concentrations
needed to achieve the desired selection result can be
readily determined empirically without undue
experimentation. Similarly, it may be necessary to
optimize the filter washing procedure to minimize
background binding. Again such optimization of the
filter washing procedures is within the skill of the
ordinary artisan.

A mathematical evaluation of SELEX referred to 
as SELEXION has been utilized by the inventors of the 
present invention. Appendix A to this application 
includes a brief review of the mathematical analysis 
utilized to obtain generalizations regarding SELEX




derived from SELEXION. 

The generalizations obtained from SELEXION are
as follows: 1) The likelihood of recovering the
best-binding RNA in each round of SELEX increases with
the number of such molecules present, with their
binding advantage versus the bulk RNA pool, and with
the total amount of protein used. Although it is not
always intuitively obvious to know in advance how to
maximize the difference in binding, the likelihood of
recovering the best-binding RNA still can be increased
by maximizing the number of RNA molecules and target
molecules sampled; 2) the ideal nucleic acid and
protein concentrations to be used in various rounds of
SELEX are dependent on several factors. The
experimental parameters suggested by SELEXION parallel
those employed in the Examples hereto. For example,
when the relative affinity of the ultimate ligand
solution is not known, which will almost inevitably be
the case when SELEX is performed, it is preferred that
the protein and nucleic acid candidate mixture
concentrations are selected to provide a binding
between about 3 and 7 percent of the total of nucleic
acids to the protein target. By using this criterion
it can be expected that a tenfold to twentyfold
enrichment in high affinity ligands will be achieved in
each round of SELEX.
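
The tenfold to twentyfold enrichment per round stated above gives a
rough way to estimate how many rounds a selection may need. The
Python sketch below is a back-of-envelope estimate under a
simplifying assumption that is not taken from SELEXION itself: each
round is treated as multiplying the ratio of high affinity ligands
to all other sequences by a constant factor. The starting and
target fractions are hypothetical values chosen for illustration.

```python
import math


def rounds_needed(initial_fraction, target_fraction, enrichment):
    """Rounds required when each round multiplies the odds of the
    high affinity ligands by a constant `enrichment` factor."""
    initial_odds = initial_fraction / (1.0 - initial_fraction)
    target_odds = target_fraction / (1.0 - target_fraction)
    return math.ceil(math.log(target_odds / initial_odds) / math.log(enrichment))


# Hypothetical case: ligands start at one part in 10^9 of the pool
# and should reach 90% of it.
slow = rounds_needed(1e-9, 0.9, enrichment=10)
fast = rounds_needed(1e-9, 0.9, enrichment=20)
print(slow, fast)
```

The estimate is most useful for comparison: under these assumptions,
doubling the per-round enrichment from tenfold to twentyfold saves
only about two rounds, because the number of rounds scales with the
logarithm of the enrichment factor.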

The experimental conditions used to select
nucleic acid ligands to various targets in the
preferred embodiment are to be selected to mimic the
environment that the target would be found in in vivo.
Example 10 below indicates how changing the selection
conditions will affect the ligand solution obtained for
a particular target. Although the ligand solution to
NGF had significant similarities under high and low
salt conditions, differences were observed. Adjustable
conditions that may be altered to more accurately
reflect the in vivo environment of the target include,




but are not limited to, the total ionic strength, the
concentration of bivalent cations and the pH of the
solution. One skilled in the art would be able to
easily select the appropriate separation conditions
based on a knowledge of the given target.

In order to proceed to the amplification step,
selected nucleic acids must be released from the target
after partitioning. This process must be done without
chemical degradation of the selected nucleic acids and
must result in amplifiable nucleic acids. In a
specific embodiment, selected RNA molecules were eluted
from nitrocellulose filters using a freshly made
solution containing 200 µl of a 7 M urea, 20 mM sodium
citrate (pH 5.0), 1 mM EDTA solution combined with 500
µl of phenol (equilibrated with 0.1 M sodium acetate pH
5.2). A solution of 200 µl 7 M urea with 500 µl of
phenol has been successfully employed. The eluted
solution of selected RNA was then extracted with ether,
ethanol precipitated and the precipitate was
resuspended in water. A number of different buffer
conditions for elution of selected RNA from the filters
can be used. For example, without limitation
nondetergent aqueous protein denaturing agents such as
guanidinium chloride, guanidinium thiocyanate, etc., as
are known in the art, can be used. The specific
solution used for elution of nucleic acids from the
filter can be routinely selected by one of ordinary
skill in the art.

Alternative partitioning protocols for
separating nucleic acid bound to targets, particularly
proteins, are available to the art. For example,
binding and partitioning can be achieved by passage of
the test nucleic acid mixture through a column which
contains the target molecule bound to a solid support
material. Those nucleic acids that bind to the target
will be retained on the column and unbound nucleic
acids can be washed from the column.




Throughout this application, the SELEX process
has been defined as an iterative process wherein
selection and amplification are repeated until a
desired selectivity has been attained. In one
embodiment of the invention, the selection process may
be efficient enough to provide a ligand solution after
only one separation step. For example, in theory a
column supporting the target through which the
candidate mixture is introduced, under the proper
conditions and with a long enough column, should be
capable of separating nucleic acids based on affinity
to the target sufficiently to obtain a ligand solution.
To the extent that the original selection step is
sufficiently selective to yield a ligand solution after
only one step, such a process would also be included
within the scope of this invention.
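
The relationship between the selectivity of a single partitioning
step and the number of rounds required can be illustrated with a
toy calculation. The sketch below is not part of the disclosed
method. It assumes a mixture of only two classes of nucleic acids
(high and low affinity), hypothetical fixed retention
probabilities for each class at every partitioning step, and
amplification that preserves the ratio of the retained species.
All function and parameter names are illustrative.

```python
def enrich_once(frac_high, keep_high, keep_low):
    """Fraction of high-affinity species after one partition/amplify round.

    keep_high and keep_low are assumed (hypothetical) probabilities that a
    high- or low-affinity molecule is retained by the partitioning step.
    """
    kept_high = frac_high * keep_high
    kept_low = (1.0 - frac_high) * keep_low
    return kept_high / (kept_high + kept_low)


def rounds_until(frac_high, keep_high, keep_low, goal, max_rounds=100):
    """Count rounds needed for the high-affinity fraction to reach goal."""
    rounds = 0
    while frac_high < goal and rounds < max_rounds:
        frac_high = enrich_once(frac_high, keep_high, keep_low)
        rounds += 1
    return rounds, frac_high


if __name__ == "__main__":
    # Modest selectivity (100-fold preference per step): several rounds.
    print(rounds_until(1e-4, 0.5, 0.005, goal=0.9))
    # Very high selectivity (a "long enough column"): a single step suffices.
    print(rounds_until(1e-4, 1.0, 1e-6, goal=0.9))
```

Under these assumptions a 100-fold preference per step needs
three rounds to take a 1-in-10,000 species above 90% of the
mixture, whereas a million-fold preference does so in one step.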

In one embodiment of this invention, SELEX is
iteratively performed until a single or a discrete
small number of nucleic acid ligands remain in the
candidate mixture following amplification. In such
cases, the ligand solution will be represented as a
single nucleic acid sequence, and will not include a
family of sequences having comparable binding
affinities to the target.

In an alternate embodiment of the invention,
SELEX iterations are terminated at some point when the
candidate mixture has been enriched in higher binding
affinity nucleic acid ligands, but still contains a
relatively large number of distinct sequences. This
point can be determined by one of skill in the art by
periodically analyzing the sequence randomness of the
bulk candidate mixture, or by assaying bulk affinity to
the target.
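
One way to put a number on the "sequence randomness" of a
sampled candidate mixture is the average per-position Shannon
entropy. This measure is offered only as an illustration; the
specification does not prescribe a particular statistic. The
sketch assumes equal-length sequences over the four RNA
nucleotides, and the function name is hypothetical.

```python
import math
from collections import Counter


def mean_positional_entropy(seqs):
    """Average Shannon entropy (bits) per position of equal-length sequences.

    A value of 2.0 means every position is fully random over A, C, G and U;
    a value of 0.0 means all sampled sequences are identical.
    """
    length = len(seqs[0])
    total = 0.0
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        n = sum(counts.values())
        total += -sum((c / n) * math.log2(c / n) for c in counts.values())
    return total / length


if __name__ == "__main__":
    print(mean_positional_entropy(["A", "C", "G", "U"]))   # fully random
    print(mean_positional_entropy(["ACGU"] * 5))            # fully converged
```

A value that has fallen well below 2.0 bits but is still above
zero would correspond to the enriched yet still diverse mixture
described above.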

At this time, SELEX is terminated, and clones
are prepared and sequenced. Of course, there will be
an almost unlimited number of clones that could be
sequenced. As seen in the Examples below, however,
after sequencing between 20 and 50 clones it is
generally possible to detect the most predominant
sequences and defining characteristics of the ligand
solution. In a hypothetical example, after cloning 30
sequences it will be found that 6 sequences are
identical, while certain sequence portions of 20 of the
other sequences are closely related to sequences within
the "winning" sequence. Although the most predominant
sequence may be considered a ligand solution to that
target, it is often more appropriate to construct or
describe a ligand solution that consists of a family of
sequences that includes the common characteristics of
many of the cloned sequences.

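The clone analysis described above can be sketched as a simple
tally. The example below is illustrative only: the clone
sequences are invented toy data, the 0.75 conservation threshold
is an arbitrary assumption, and the functions assume that all
clones have already been aligned to the same length.

```python
from collections import Counter


def predominant(clones):
    """Return the most frequent cloned sequence and its count."""
    seq, count = Counter(clones).most_common(1)[0]
    return seq, count


def consensus(clones, min_fraction=0.75):
    """Describe a family of aligned sequences position by position.

    A position is reported as its most common base when that base occurs in
    at least min_fraction of the clones, and as N (any nucleotide) otherwise.
    """
    out = []
    for column in zip(*clones):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= min_fraction else "N")
    return "".join(out)


if __name__ == "__main__":
    toy_clones = ["AAGUCC", "AAGUCC", "AAGUAC", "AAGUGG"]  # invented data
    print(predominant(toy_clones))
    print(consensus(toy_clones))
```

The first function corresponds to reporting the single "winning"
sequence; the second corresponds to describing the ligand
solution as a family with defined and undefined positions.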
In a further embodiment of this invention, a
ligand solution that is represented as a family of
sequences having a number of defining characteristics
(e.g., where the ligand solution is [sequence illegible
in source], where N can apparently be any of the four
nucleotides) is used to initiate an additional SELEX
process. In this embodiment, the candidate mixture
would be comprised of partially fixed and partially
random nucleotides, the fixed nucleotides being selected
based on the ligand solution received in the initial
SELEX process. In this manner, if there is a single
nucleotide sequence that binds better than the other
members of the ligand solution family, it will be
quickly identified.
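
A partially fixed, partially random candidate mixture can be
pictured as a template in which N marks a randomized position.
The sketch below is only an illustration of that idea: the
template "GGNNCN" is a made-up example rather than a sequence
from this specification, and equal probability for each of the
four nucleotides at N positions is an assumption.

```python
import random

ALPHABET = "ACGU"


def pool_size(template):
    """Number of distinct sequences encoded by a template with N wildcards."""
    return 4 ** template.count("N")


def sample_from_template(template, rng):
    """Draw one member of the pool: fixed bases kept, each N randomized."""
    return "".join(rng.choice(ALPHABET) if b == "N" else b for b in template)


if __name__ == "__main__":
    rng = random.Random(1)          # seeded only to make the demo repeatable
    template = "GGNNCN"             # hypothetical template
    print(pool_size(template))
    print([sample_from_template(template, rng) for _ in range(3)])
```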

In an alternate further embodiment of the
invention, a second SELEX experiment based on the
ligand solution received in a SELEX process is also
utilized. In this embodiment, the single most
predominant sequence (e.g., [sequence illegible in
source]) is used to inform the second SELEX process.
In this second SELEX process the candidate mixture is
prepared in order to yield sequences based on the
selected winner, while assuring that there will be
sufficient randomization at each of the sequences.
This candidate mixture may be
produced by using nucleotide starting materials that
are biased rather than randomized. For example, the A
solution contains 75% A and 25%  C and G. Although
the nucleic acid synthesizer is set to yield the
predominant nucleotide, the presence of the other
nucleotides in the A solution will yield nucleic acid
sequences that are predominant in A but that will also
yield variations in this position. Again, this second
SELEX round, informed by the results obtained in the
initial SELEX process, will maximize the probabilities
of obtaining the best ligand solution to a given
target. Again, it must be clarified that the ligand
solution may consist of a single preferred nucleic acid
ligand, or it may consist of a family of structurally
related sequences with essentially similar binding
affinities.
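
The effect of biased nucleotide starting materials can be
simulated in a few lines. The sketch below uses the 75% figure
given above for the predominant nucleotide; splitting the
remaining 25% equally among the other three nucleotides is an
assumption made for the illustration, and the "winner" sequence
is an invented placeholder.

```python
import random

ALPHABET = "ACGU"


def doped_variant(winner, rng, keep=0.75):
    """Return one variant of winner made with biased ("doped") synthesis.

    Each position keeps the winner's base with probability keep; otherwise
    one of the other three bases is chosen with equal probability (assumed).
    """
    out = []
    for base in winner:
        if rng.random() < keep:
            out.append(base)
        else:
            out.append(rng.choice([b for b in ALPHABET if b != base]))
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(7)       # seeded only to make the demo repeatable
    winner = "AAGGCCUU"          # hypothetical winning sequence
    for _ in range(3):
        print(doped_variant(winner, rng))
```

Most variants drawn this way stay close to the winner while every
position is still sampled with each alternative base, which is the
property the second SELEX round relies on.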

In practice, it may occasionally be preferred
that the SELEX process not be performed until a single
sequence is obtained. The SELEX process contains
several bias points that may affect the predominance of
certain sequences in a candidate mixture after several
rounds of SELEX that are not related to the binding
affinity of that sequence to the target. For example,
a bias for or against certain sequences may occur
during the production of cDNA from the RNA recovered
after selection, or during the amplification process.
The effects of such unpredictable biases can be
minimized by halting SELEX prior to the time that only
one or a small number of sequences predominate in the
reaction mixture.

As stated above, sequence variation in the test
nucleic acid mixture can be achieved or increased by
mutation. For example, a procedure has been described
for efficiently mutagenizing nucleic acid sequences
during PCR amplification (Leung et al. 1989). This
method or functionally equivalent methods can
optionally be combined with amplification procedures in
the present invention. 

Alternatively, conventional methods of DNA
mutagenesis can be incorporated into the nucleic acid
amplification procedure. Applicable mutagenesis
procedures include, among others, chemically induced
mutagenesis and oligonucleotide site-directed
mutagenesis.

The present invention can also be extended to
utilize additional interesting capacities of nucleic
acids and the manner in which they are known or will
later be found to interact with targets such as
proteins. For example, a SELEX methodology may be
employed to screen for ligands that form Michael
adducts with proteins. Pyrimidines, when they sit in
the correct place within a protein, usually adjacent to
a critical cysteine or other nucleophile, can react
with that nucleophile to form a Michael adduct. The
mechanism by which Michael adducts are formed involves
a nucleophilic attack at the 6 position of the
pyrimidine base to create a transient (but slowly
reversing) intermediate that is really a 5,6
dihydropyrimidine. It is possible to test for the
presence of such intermediates by observing whether
binding between an RNA and a protein target occurs even
after the protein is denatured with any appropriate
denaturant. That is, one searches for a continued
covalent interaction when the binding pocket of the
target has been destroyed. However, Michael adducts
are often reversible, and sometimes so quickly that the
failure to identify a Michael adduct through this test
does not indicate that one was not present at a prior
moment.

SELEX may be done so as to take advantage of
Michael adduct formation in order to create very high
affinity, near-suicide substrates for an enzyme or
other protein target. Imagine that after binding
between a randomized mixture of RNAs and the target,
prior to partitioning on a filter or by other means,
the target is denatured. Subsequent partitioning,
followed by reversal of the Michael adduct and cDNA
synthesis on the released RNA, followed by the rest of
the SELEX cycle, will enrich for RNAs that bind to a
target prior to denaturation but continue to bind
covalently until the Michael adduct is reversed by the
scientist. This ligand, in vivo, would have the
property of permanently inhibiting the target protein.
The protein tRNA-uracil methyl transferase (RUMT) binds
substrate tRNAs through a Michael adduct. When RUMT is
expressed at high levels in E. coli the enzyme is found
largely covalently bound to RNA, suggesting strongly
that nearly irreversible inhibitors can be found
through SELEX.

The method of the present invention has
multiple applications. The method can be employed, for
example, to assist in the identification and
characterization of any protein binding site for DNA or
RNA. Such binding sites function in transcriptional or
translational regulation of gene expression, for
example as binding sites for transcriptional activators
or repressors, transcription complexes at promoter
sites, replication accessory proteins and DNA
polymerases at or near origins of replication and
ribosomes and translational repressors at ribosome
binding sites. Sequence information of such binding
sites can be used to isolate and identify regulatory
regions, bypassing more labor intensive methods of
characterization of such regions. Isolated DNA
regulatory regions can be employed, for example, in
heterologous constructs to selectively alter gene
expression.

It is an important and unexpected aspect of the
present invention that the methods described herein can
be employed to identify, isolate or produce nucleic
acid molecules which will bind specifically to any
desired target molecule. Thus, the present methods can
be employed to produce nucleic acids specific for
binding to a particular target. Such a nucleic acid
ligand in a number of ways functionally resembles an
antibody. Nucleic acid ligands which have binding
functions similar to those of antibodies can be
isolated by the methods of the present invention. Such
nucleic acid ligands are designated herein nucleic acid
antibodies and are generally useful in applications in
which polyclonal or monoclonal antibodies have found
application. Nucleic acid antibodies can in general be
substituted for antibodies in any in vitro or in vivo
application. It is only necessary that under the
conditions in which the nucleic acid antibody is
employed, the nucleic acid is substantially resistant
to degradation. Applications of nucleic acid
antibodies include the specific, qualitative or
quantitative detection of target molecules from any
source; purification of target molecules based on their
specific binding to the nucleic acid; and various
therapeutic methods which rely on the specific
direction of a toxin or other therapeutic agent to a
specific target site.

Target molecules are preferably proteins, but
can also include, among others, carbohydrates,
peptidoglycans and a variety of small molecules. As
with conventional proteinaceous antibodies, nucleic
acid antibodies can be employed to target biological
structures, such as cell surfaces or viruses, through
specific interaction with a molecule that is an
integral part of that biological structure. Nucleic
acid antibodies are advantageous in that they are not
limited by self tolerance, as are conventional
antibodies. Also, nucleic acid antibodies do not
require animals or cell cultures for synthesis or
production, since SELEX is a wholly in vitro process.

As is well-known, nucleic acids can bind to
complementary nucleic acid sequences. This property of
nucleic acids has been extensively utilized for the
detection, quantitation and isolation of nucleic acid
molecules. Thus, the methods of the present invention
are not intended to encompass these well-known binding
capabilities between nucleic acids. Specifically, the
methods of the present invention related to the use of
nucleic acid antibodies are not intended to encompass
known binding affinities between nucleic acid
molecules. A number of proteins are known to function
via binding to nucleic sequences, such as regulatory
proteins which bind to nucleic acid operator sequences.
The known ability of certain nucleic acid binding
proteins to bind to their natural sites, for example,
has been employed in the detection, quantitation,
isolation and purification of such proteins. The
methods of the present invention related to the use of
nucleic acid antibodies are not intended to encompass
the known binding affinity between nucleic acid binding
proteins and nucleic acid sequences to which they are
known to bind. However, novel, non-naturally-occurring
sequences which bind to the same nucleic acid binding
proteins can be developed using SELEX. It should be
noted that SELEX allows very rapid determination of
nucleic acid sequences that will bind to a protein and,
thus, can be readily employed to determine the
structure of unknown operator and binding site
sequences, which sequences can then be employed for
applications as described herein. It is believed that
the present invention is the first disclosure of the
general use of nucleic acid molecules for the
detection, quantitation, isolation and purification of
proteins which are not known to bind nucleic acids. As
will be discussed below, certain nucleic acid
antibodies isolatable by SELEX can also be employed to
affect the function, for example inhibit, enhance or
activate the function, of specific target molecules or
structures. Specifically, nucleic acid antibodies can 
be employed to inhibit, enhance or activate the 
function of proteins. 

Proteins that have a known capacity to bind
nucleic acids (such as DNA polymerases, other
replicases, and proteins that recognize sites on RNA
but do not engage in further catalytic action) yield,
via SELEX, high affinity RNA ligands that bind to the
active site of the target protein. Thus, in the case
of HIV-1 reverse transcriptase the resultant RNA ligand
(called 1.1 in Example 2) blocks cDNA synthesis in the
presence of a primer DNA, an RNA template, and the four
deoxynucleotide triphosphates.

The inventors' theory of RNA structures suggests
that nearly every protein will serve as a target for
SELEX. The initial experiments against non-nucleic
acid binding proteins were performed with three proteins
not thought to interact with nucleic acids in general
or RNA in particular. The three proteins were tissue
plasminogen activator (tPA), nerve growth factor (NGF)
and the extracellular domain of the growth factor
receptor (gfR-Xtra). All of these proteins were tested
to see if they would retain mixed randomized RNAs on a
nitrocellulose filter. tPA and NGF showed affinity for
randomized RNA, with Kd's just below µM. gfR-Xtra did
not bind with measurable affinity, suggesting that if
an RNA antibody exists for that protein it must bind to
a site that has no affinity for most other RNAs.

tPA and NGF were taken through the SELEX drill
using RNAs with 30 randomized positions. Both tPA and
NGF gave ligand solutions in the SELEX drill,
suggesting that some site on each protein bound the
winning sequences more tightly than that site (or
another site) bound other RNAs. The winning sequences
are different for the two proteins.

Since tPA and NGF worked so well in the SELEX
drill, a random collection of proteins and peptides
were tested to see if they had any affinity for RNA.
It was reasoned that if a protein has any affinity for
RNA, the SELEX drill will, on the average, yield
higher affinity sequences which contact the same region
of the target that provides the low, generalized
affinity. A set of proteins and peptides were tested
to see if randomized RNAs (containing 40 randomized
positions) would be retained on nitrocellulose filters.
About two thirds of the proteins tested bound RNA, and
a few proteins bound RNA very tightly. See Example 9.

Proteins that do not bind RNA to nitrocellulose
filters may fail for trivial reasons having nothing to
do with the likelihood of raising RNA antibodies. One
example, bradykinin, fails to bind to nitrocellulose
filters, and thus would fail in the above experiment.
A bradykinin linked to a solid matrix through the amino
terminus of the peptide was prepared, and it was then
found that randomized RNA bound tightly to the matrix
(see Example 7). Thus in the initial experiments two
short peptides, bradykinin and bombesin, bind randomized
RNAs quite tightly. Any high affinity RNA ligand obtained
through SELEX with these peptide targets would,
perhaps, be an antagonist of these active peptides, and
might be useful therapeutically. It is difficult to
imagine an RNA of about 30 nucleotides binding to a
very small peptide without rendering that peptide
inactive for virtually any activity.

As described in Examples 4, 7, 9 and 10 below,
proteins not thought to interact with nucleic acids in
nature were found to bind a random mixture of nucleic
acids to a non-trivial extent. It has further been
shown that, for such proteins that were found to bind
RNA mixtures non-specifically, a ligand solution
can be obtained following SELEX. It is, therefore, a
potentially valuable screen, prior to the performance
of SELEX, to determine if a given target shows any
binding to a random mixture of nucleic acids.



It is a second important and unexpected aspect
of the present invention that the methods described
herein can be employed to identify, isolate or produce
nucleic acid molecules which will bind specifically to
a particular target molecule and affect the function of
that molecule. In this aspect, the target molecules
are again preferably proteins, but can also include,
among others, carbohydrates and various small molecules
to which specific nucleic acid binding can be achieved.
Nucleic acid ligands that bind to small molecules can
affect their function by sequestering them or by
preventing them from interacting with their natural
ligands. For example, the activity of an enzyme can be
affected by a nucleic acid ligand that binds the
enzyme's substrate. Nucleic acid ligands, i.e.,
nucleic acid antibodies, of small molecules are
particularly useful as reagents for diagnostic tests
(or other quantitative assays). For example, the
presence of controlled substances, bound metabolites or
abnormal quantities of normal metabolites can be
detected and measured using nucleic acid ligands of the
invention. A nucleic acid ligand having catalytic
activity can affect the function of a small molecule by
catalyzing a chemical change in the target. The range
of possible catalytic activities is at least as broad
as that displayed by proteins. The strategy of
selecting a ligand for a transition state analog of a
desired reaction is one method by which catalytic
nucleic acid ligands can be selected.

It is believed that the present invention for
the first time discloses the general use of nucleic
acid molecules to effect, inhibit or enhance protein
function. The binding selection methods of the present
invention can be readily combined with secondary
selection or screening methods for modifying target
molecule function on binding to selected nucleic acids.
The large population of variant nucleic acid sequences
that can be tested by SELEX enhances the probability
that nucleic acid sequences can be found that have a
desired binding capability and function to modify
target molecule activity. The methods of the present
invention are useful for selecting nucleic acid ligands
which can selectively affect function of any target
protein including proteins which bind nucleic acids as
part of their natural biological activity and those
which are not known to bind nucleic acid as part of
their biological function. The methods described
herein can be employed to isolate or produce nucleic
acid ligands which bind to and modify the function of
any protein which binds a nucleic acid, either DNA or
RNA, either single-stranded or double-stranded; a
nucleoside or nucleotide including those having purine
or pyrimidine bases or bases derived therefrom,
specifically including those having adenine, thymine,
guanine, uracil, cytosine and hypoxanthine bases and
derivatives, particularly methylated derivatives,
thereof; and coenzyme nucleotides including among
others nicotinamide nucleotides, flavin-adenine
dinucleotides and coenzyme A. It is contemplated that
the method of the present invention can be employed to
identify, isolate or produce nucleic acid molecules
which will affect catalytic activity of target enzymes,
i.e., inhibit catalysis or modify substrate binding;
affect the functionality of protein receptors, i.e.,
inhibit binding to receptors or modify the specificity
of binding to receptors; affect the formation of
protein multimers, i.e., disrupt quaternary structure
of protein subunits; and modify transport properties of
protein, i.e., disrupt transport of small molecules or
ions by proteins.

The SELEX process is defined herein as the
iterative selection and amplification of a candidate
mixture of nucleic acid sequences repeated until a
ligand solution has been obtained. A further step in
the process is the production of nucleic acid
antibodies to a given target. Even when the ligand
solution derived for a given process is a single
sequence, the nucleic acid antibody containing just the
ligand solution must be synthesized. For example, a
SELEX experiment may give a preferred single ligand
solution that consists of only 20 of the 30 randomized
nucleotide sequences used in the SELEX candidate
mixture. The therapeutically valuable nucleic acid
antibody would not, preferably, contain the 10 non-critical
nucleotides or the fixed sequences required
for the amplification step of SELEX. Once the desired
structure of the nucleic acid antibody is determined
based on the ligand solution, the actual synthesis of
the nucleic acid antibody will be performed according
to a variety of techniques well known in the art.

The nucleic acid antibody may also be
constructed based on a ligand solution for a given
target that consists of a family of sequences. In such
case, routine experimentation will show that a given
sequence is preferred due to circumstances unrelated to
the relative affinity of the ligand solution to the
target. Such considerations would be obvious to one of
ordinary skill in the art.

In an alternate embodiment of the present
invention, the nucleic acid antibody may contain a
plurality of nucleic acid ligands to the same target.
For example, SELEX may identify two discrete ligand
solutions. As the two ligand solutions may bind the
target at different locations, the nucleic acid
antibody may preferably contain both ligand solutions.
In another embodiment, the nucleic acid antibody may
contain more than one of a single ligand solution.
Such multivalent nucleic acid antibodies will have
increased binding affinity to the target unavailable to
an equivalent nucleic acid antibody having only one
ligand.



In addition, the nucleic acid antibody may also
contain other elements that will 1) add independent
affinity for the target to the nucleic acid antibody;
2) dependently enhance the affinity of the nucleic acid
ligand to the target; 3) direct or localize the nucleic
acid antibody to the proper location in vivo where
treatment is desired; or 4) utilize the specificity of
the nucleic acid ligand to the target to effect some
additional reaction at that location.

The methods of the present invention are useful
for obtaining nucleic acids which will inhibit function
of a target protein, and are particularly useful for
obtaining nucleic acids which inhibit the function of
proteins whose function involves binding to nucleic
acid, nucleotides, nucleosides and derivatives and
analogs thereof. The methods of the present invention
can provide nucleic acid inhibitors, for example, of
polymerases, reverse transcriptases, and other enzymes
in which a nucleic acid, nucleotide or nucleoside is a
substrate or co-factor.

Secondary selection methods that can be
combined with SELEX include, among others, selections or
screens for enzyme inhibition, alteration of substrate
binding, loss of functionality, disruption of
structure, etc. Those of ordinary skill in the art are
able to select among various alternatives those
selection or screening methods that are compatible with
the methods described herein.

It will be readily apparent to those of skill
in the art that in some cases, i.e., for certain target
molecules or for certain applications, it may be
preferred to employ RNA molecules in preference to DNA
molecules as ligands, while in other cases DNA ligands
may be preferred to RNA.

The selection methods of the present invention
can also be employed to select nucleic acids which bind
specifically to a molecular complex, for example to a
substrate/protein or inhibitor/protein complex. Among
those nucleic acids that bind specifically to the
complex molecules, but not the uncomplexed molecules,
there are nucleic acids which will inhibit the
formation of the complex. For example, among those
nucleic acid ligands which are selected for specific
binding to a substrate/enzyme complex there are nucleic
acids which can be readily selected which will inhibit
substrate binding to the enzyme and thus inhibit or
disrupt catalysis by the enzyme.

An embodiment of the present invention, which
is particularly useful for the identification or
isolation of nucleic acids which bind to a particular
functional or active site in a protein, or other target
molecule, employs a molecule known, or selected, for
binding to a desired site within the target protein to
direct the selection/amplification process to a subset
of nucleic acid ligands that bind at or near the
desired site within the target molecule. In a simple
example, a nucleic acid sequence known to bind to a
desired site in a target molecule is incorporated near
the randomized region of all nucleic acids being tested
for binding. SELEX is then used (Fig. 9) to select
those variants, all of which will contain the known
binding sequence, which bind most strongly to the
target molecule. A longer binding sequence, which is
anticipated to either bind more strongly to the target
molecule or more specifically to the target, can thus be
selected. The longer binding sequence can then be
introduced near the randomized region of the nucleic
acid test mixture and the selection/amplification steps
repeated to select an even longer binding sequence.
Iteration of these steps (i.e., incorporation of
selected sequence into test mixtures followed by
selection/amplification for improved or more specific
binding) can be repeated until a desired level of
binding strength or specificity is achieved. This
iterative "walking" procedure allows the selection of
nucleic acids highly specific for a particular target
molecule or site within a target molecule. Another
embodiment of such an iterative "walking" procedure
employs an "anchor" molecule which is not necessarily a
nucleic acid (see Figs. 10 and 11). In this embodiment
a molecule which binds to a desired target, for example
a substrate or inhibitor of a target enzyme, is
chemically modified such that it can be covalently
linked to an oligonucleotide of known sequence (the
"guide oligonucleotide" of Fig. 10). The guide
oligonucleotide chemically linked to the "anchor"
molecule that binds to the target also binds to the
target molecule. The sequence complement of the guide
oligonucleotide is incorporated near the randomized
region of the test nucleic acid mixture. SELEX is then
performed to select for those sequences that bind most
strongly to the target molecule/anchor complex. The
iterative walking procedure can then be employed to
select or produce longer and longer nucleic acid
molecules with enhanced strength of binding or
specificity of binding to the target. The use of the
"anchor" procedure is expected to allow more rapid
isolation of nucleic acid ligands that bind at or near
a desired site within a target molecule. In
particular, it is expected that the "anchor" method in
combination with iterative "walking" procedures will
result in nucleic acids which are highly specific
inhibitors of protein function (Fig. 11).

In certain embodiments of the performance of
SELEX it is desirable to perform plus/minus screening
in conjunction with the selection process to assure
that the selection process is not being skewed by some
factor unrelated to the affinity of the nucleic acid
sequences to the target. For example, when selection
is performed by protein binding nitrocellulose, it has
been seen that certain nucleic acid sequences are
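preferentially retained by nitrocellulose and can be
selected during the SELEX process. These sequences can
be removed from the candidate mixture by incorporating
additional steps wherein the preceding SELEX mixture is
passed through nitrocellulose to selectively remove
those sequences selected solely for that property.
Such screening and selection may be performed whenever
the target contains impurities or the selection process
introduces biases unrelated to affinity to the target.

SELEX has been demonstrated by application to
the isolation of RNA molecules which bind to and
inhibit the function of bacteriophage T4 DNA
polymerase, also termed gp43. The novel RNA ligand of
T4 DNA polymerase is useful as a specific assay reagent
for T4 DNA polymerase. The synthesis of T4 DNA
polymerase is autogenously regulated. In the absence
of functional protein, amber fragments and mutant
proteins are overexpressed when compared to the rate of
synthesis of wild-type protein in replication-deficient
infections (Russel (1973)). In vitro translation of an
N-terminal fragment of gp43 is specifically repressed
by the addition of purified gp43, and gp43 protects a
discrete portion of the mRNA near its ribosome binding
site from nuclease attack (Andrake et al. (1988)). The
size and sequence of the RNA translational operator to
which gp43 binds and the strength of that binding have
been established. The minimal size of the gp43
operator is a sequence of about 36 nucleotides, as
illustrated in Fig. 1, which is predicted to have a
hairpin loop structure as indicated therein. The
minimal size of the operator was determined by analysis
of binding of end-labeled hydrolysis fragments of the
operator to gp43. Analysis of binding of operator
mutants in the hairpin and loop sequence indicates that
gp43 binding to the operator is sensitive to primary
base changes in the helix.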
Binding to the polymerase was even more reduced by
changes which significantly reduce hairpin stability.
Operator binding was found to be very sensitive to loop
sequence. It was found that replication and operator
binding in gp43 are mutually exclusive activities. The
addition of micromolar amounts of purified RNAs
containing intact operator was found to strongly
inhibit in vitro replication by gp43.

The wild-type gp43 operator, Fig. 1, was
employed as the basis for the design of an initial
mixture of RNA molecules containing a randomized
sequence region to assess the ability of the
selection/amplification process to isolate nucleic acid
molecules that bind to a protein. The RNA test mixture
was prepared by in vitro transcription from a 110 base
single-stranded DNA template. The template was
constructed as illustrated in Figure 1 to encode most
of the wild-type operator sequence, except for the loop
sequence. The eight base loop sequence was replaced by
a randomized sequence region which was synthesized to
be fully random at each base. The template also
contained sequences necessary for efficient
amplification: a sequence at its 3' end complementary
to a primer for reverse transcription and amplification
in polymerase chain reactions and a sequence in its 5'
end required for T7 RNA polymerase transcriptional
initiation and sufficient sequence complementary to the
cDNA of the in vitro transcript. The DNA template is
thus a mixture of all loop sequence variants,
theoretically containing 65,536 individual species.
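
The figure of 65,536 follows from four possible bases at each of the eight randomized loop positions (4^8). A minimal Python sketch, using illustrative names that are not part of the original disclosure, enumerates the pool and confirms that the wild-type loop AAUAACUC is one member of it:

```python
from itertools import product

BASES = "ACGU"      # RNA alphabet
LOOP_LENGTH = 8     # number of randomized loop positions, per the text


def loop_variants(length=LOOP_LENGTH):
    """Yield every possible loop sequence of the given length."""
    for combo in product(BASES, repeat=length):
        yield "".join(combo)


variants = set(loop_variants())
n_variants = len(variants)
wild_type_present = "AAUAACUC" in variants
print(n_variants, wild_type_present)
```

Full enumeration is only practical because the randomized region is short; the count grows as 4 to the power of the region length.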

The dissociation constant for the wild-type
loop RNA was found to be about 5 x 10^-9 M. The
dissociation constant for the population of loop
sequence variants was measured to be about 2.5 x 10^-7 M.
Randomization of the loop sequence lowered binding
affinity 50-fold.

In vitro transcripts containing the loop 






sequence variants were mixed with purified gp43 and
incubated. The mixture was filtered through a
nitrocellulose filter. Protein-RNA complexes are
retained on the filter and unbound RNA is not.
Selected RNA was then eluted from the filters as
described in Example 1. Selected RNAs were extended
with AMV reverse transcriptase in the presence of 3'
primer as described in Gauss et al. (1987) supra. The
resulting cDNA was amplified with Taq DNA polymerase in
the presence of the 5' primer for 30 cycles as
described in Innis et al. (1988) supra. The selected
amplified DNA served as a template for in vitro
transcription to produce selected amplified RNA
transcripts which were then subject to another round of
binding selection/amplification. The RNA/protein ratio
in the binding selection mixture was held constant
throughout the cycles of selection. The iterative
selection/amplification was performed using several
different RNA/protein molar ratios. In all experiments
RNA was in excess: experiment A employed an RNA/gp43
of 10/1 (moles/moles); experiment B employed an
RNA/gp43 of 1000/1; and experiment C employed an
RNA/gp43 of 100/1.
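
The effect of repeating the bind, partition and amplify cycle can be illustrated with a deliberately simplified model. The sketch below is an assumption-laden toy, not the procedure of the disclosure: it treats the pool as two classes of RNA, takes the bound fraction of each class as [P]/(Kd + [P]), and ignores competition for limiting protein, filter background and amplification bias. All numerical values are hypothetical.

```python
def bound_fraction(protein, kd):
    """Fraction of an RNA species bound at a given free protein concentration."""
    return protein / (kd + protein)


def selex_round(pool, protein):
    """One idealized round: retain each species in proportion to its bound
    fraction, then renormalize (amplification restores the pool size)."""
    selected = {kd: frac * bound_fraction(protein, kd) for kd, frac in pool.items()}
    total = sum(selected.values())
    return {kd: amount / total for kd, amount in selected.items()}


# Hypothetical two-class pool keyed by Kd (M): rare tight binders, abundant weak binders.
TIGHT_KD, WEAK_KD = 5e-9, 5e-6
pool = {TIGHT_KD: 1e-4, WEAK_KD: 1 - 1e-4}
PROTEIN = 3e-8  # assumed free protein concentration (M), illustrative only

history = [pool[TIGHT_KD]]
for _ in range(4):
    pool = selex_round(pool, PROTEIN)
    history.append(pool[TIGHT_KD])

print([round(fraction, 4) for fraction in history])
```

In this toy model the tight-binding class goes from one part in ten thousand to nearly the whole pool within four rounds, which is the qualitative behavior the iterative process relies on.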

The progress of the selection process was
monitored by filter binding assays of labelled
transcripts of amplified cDNA at the completion of each
cycle of the procedure. Batch sequencing of the RNA
products from each round for experiment B was also done
to monitor the progress of the selection.
Autoradiograms of sequencing gels of RNA products after
2, 3 and 4 rounds of selection/amplification are shown
in Figure 3. It is clear that there was no apparent
loop sequence bias introduced until after the third
selection. After the fourth round of selection, an
apparent consensus sequence for the eight base loop
sequence is discernable as: A(a/g)(u/c)AAC(u/c)(u/c).
Batch sequencing of selected RNA after the fourth round




of selection for experiments A, B and C is compared in
Figure 4. All three independent SELEX procedures using
different RNA/protein ratios gave similar apparent
consensus sequences. There was, however, some apparent
bias for wild-type loop sequence (AAUAACUC) in the
selected RNA from experiments A and C.
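
The apparent consensus A(a/g)(u/c)AAC(u/c)(u/c) fixes four loop positions and allows two bases at each of the other four, so it admits 2^4 = 16 loop sequences. The following sketch, with names chosen here for illustration, expands the consensus and checks that the wild-type loop and the AGCAACCU loop reported for experiment B both satisfy it:

```python
from itertools import product

# Consensus from the text, one list of allowed bases per loop position.
CONSENSUS = [["A"], ["A", "G"], ["U", "C"], ["A"], ["A"], ["C"], ["U", "C"], ["U", "C"]]


def expand(consensus):
    """Return every sequence permitted by a position-wise consensus."""
    return ["".join(combo) for combo in product(*consensus)]


allowed = expand(CONSENSUS)
wild_type_allowed = "AAUAACUC" in allowed
variant_allowed = "AGCAACCU" in allowed
print(len(allowed), wild_type_allowed, variant_allowed)
```

Batch sequencing reports only which bases occur at each position, so it cannot say which of the 16 combinations are actually present; that is why individual clones were sequenced.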

In order to determine what allowable sequence
combinations were actually present in the selected
RNAs, individual DNAs were cloned from selected RNAs
after the fourth round of selection in experiment B.
The batch sequence result from experiment B appeared to
indicate an even distribution of the two allowable
nucleotides which composed each of the four variable
positions of the loop sequence. Individuals were
cloned into pUC18 as described by Sambrook, J. et al.
(1989) Molecular Cloning: A Laboratory Manual, (Cold
Spring Harbor, N.Y.), Sections 1.13; 1.85-1.86. Twenty
individual clones that were identified by colony filter
hybridization to the 3' primer were sequenced. None of
the sequenced clones were mutant at any place in the
operator sequence outside of the loop sequence. Only
five variant sequences were observed as shown in Figure
7, and surprisingly only two sequence variants were the
major components of the selected mixture. The
frequencies of each sequence in the 20 individual
isolates sequenced are also given in Figure 7. The
wild-type sequence AAUAACUC and the loop AGCAACCU were
present in approximately equal amounts in the selected
RNA of experiment B. The other selected variants were
1 base mutants of the two major variants. The strength
of binding of the sequence variants was compared in
filter binding assays using labelled in vitro
transcripts derived from each of the purified clonal
isolates. As shown in Figure 6, a rough correlation
between binding affinity of an RNA for gp43 and the
abundance of the selected sequence was observed. The
two major loop sequence variants showed approximately



equal binding affinities for gp43. 

The loop sequence variant RNAs isolated by the 
selection/amplification process, shown in Figure 7, can 
all act as inhibitors of gp43 polymerase activity as 
has been demonstrated for the wild-type operator 
sequence. 

An example of the use of SELEX has been
provided by selection of a novel RNA ligand of
bacteriophage T4 DNA polymerase (gp43) (Andrake et al.
(1988) Proc. Natl. Acad. Sci. USA 85:7942-7946).

The present invention includes specific ligand
solutions, derived via the SELEX process, that are
shown to have an increased affinity to HIV-1 reverse
transcriptase, R17 coat protein, HIV-1 rev protein, HSV
DNA polymerase, E. coli ribosomal protein S1, tPA and
NGF. These ligand solutions can be utilized by one of
skill in the art to synthesize nucleic acid antibodies
to the various targets.

The following examples describe the successful
application of SELEX to a wide variety of targets. The
targets may generally be divided into two
categories: those that are nucleic acid binding
proteins and those proteins not known to interact with
nucleic acids. In each case a ligand solution is
obtained. In some cases it is possible to represent
the ligand solution as a nucleic acid motif such as a
hairpin loop, an asymmetric bulge or a pseudoknot. In
other examples the ligand solution is presented as a
primary sequence. In such cases it is not meant to be
implied that the ligand solution does not contain a
definitive tertiary structure.

In addition to T4 DNA polymerase, targets on
which SELEX has been successfully performed include
bacteriophage R17 coat protein, HIV reverse
transcriptase (HIV-RT), HIV-1 rev protein, HSV DNA
polymerase plus or minus cofactor, E. coli ribosomal
protein S1, tPA and NGF. The following experiments




also describe a protocol for testing the bulk binding
affinity of a randomized nucleic acid candidate mixture
to a variety of proteins. Example 7 also describes the
immobilization of bradykinin and the results of bulk
randomized nucleic acid binding studies on bradykinin.

The examples and illustrations herein are not
to be taken as limiting in any way. The fundamental
insight underlying the present invention is that
nucleic acids as chemical compounds can form a
virtually limitless variety of sizes, shapes and
configurations and are capable of an enormous
repertoire of binding and catalytic functions, of which
those known to exist in biological systems are merely a
glimpse.


EXAMPLES 

The following materials and methods were used
throughout.

The transcription vector pT7-2 is commercially
available (U.S. Biochemical Company, Cleveland, OH).
Plasmid pUC18 is described by Norrander et al. (1983)
Gene 2^:15-27 and is also commercially available from
New England Biolabs. All manipulations of DNA to
create new recombinant plasmids were as described in
Maniatis et al. (1982) Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, except as otherwise noted. DNA
oligonucleotides were synthesized and purified as
described in Gauss et al. (1987) Mol. Gen. Genet.
206:24-34.

In vitro transcriptions with T7 RNA polymerase
and RNA gel-purification were performed as described in
Milligan et al. (1987) Nucl. Acids Res. 15:8783-8798,
except that in labeling reactions the concentrations of
ATP, CTP, and GTP were 0.5 mM each, and the UTP
concentration was 0.05 mM. The UTP was labeled at the
alpha position with 32P at a specific activity of






[...] according to Gauss et al. (1987) supra.

Dilutions of labeled, gel-purified RNA and
purified gp43 were made in 200 mM potassium acetate, 50
mM Tris-HCl pH 7.7 at 4°C. In nitrocellulose filter
binding assays, purified gp43 was serially diluted and
30 µl aliquots of each dilution of protein were added
to 30 µl aliquots of diluted, labeled, gel-purified
RNA. The RNA dilution (50 µl) was spotted on a fresh
nitrocellulose filter, dried and counted to determine
input counts per tube. The concentration of protein in
the reactions ranged from [...] to [...] and the
concentration of the RNAs in each experiment was
approximately 10[...] M. After incubation [...]
minutes, each tube was [...] and
50 µl of each sample filtered through pre-wet
nitrocellulose filters [...] and washed
with 3 ml of [...]. The filters were dried and counted
in Ecolume scintillation fluid (ICN Biomedicals, Inc.).
Controls were done in the absence of gp43, from which
the background ([...] of the input
counts) was determined. From each set of measurements
the background was subtracted, and the percent of total
input counts remaining on the filters calculated. From
each set of data points, a best-fit theoretical
bimolecular binding curve was generated using a version
of a published program [...]
modified to construct a curve described by the
equation,

    θ = A[gp43] / (Kd + [gp43])

where θ is the fraction of the total RNA that is bound
to the filter, A is the percent of RNA at which binding
saturates (approximately [...] for this protein-RNA
interaction), [gp43]
is the input gp43 concentration,




and Kd is the dissociation constant for the bimolecular
reaction. This equation is an algebraic rearrangement
of equation [1-5] from Bisswanger (1979) Theorie und
Methoden der Enzymkinetik, Verlag Chemie, Weinheim,
FRG, p. 9 with the simplifying assumption that the
concentration of the protein far exceeds the
concentration of RNA-protein complexes, an assumption
which is valid in the experiments described.
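
The simplifying assumption can be checked numerically. The sketch below assumes that the fitted curve has the form theta = A[gp43]/(Kd + [gp43]), which is what the definitions of A, [gp43] and Kd given here imply, and compares it with the exact solution of the bimolecular equilibrium (P - C)(R - C) = Kd*C for the complex concentration C. Function names and the concentrations used are illustrative assumptions only.

```python
import math


def theta_approx(protein_total, kd, plateau=1.0):
    """Simplified curve: theta = A * [P] / (Kd + [P])."""
    return plateau * protein_total / (kd + protein_total)


def theta_exact(protein_total, rna_total, kd, plateau=1.0):
    """Exact bimolecular equilibrium, solving (P - C)(R - C) = Kd * C for C."""
    b = protein_total + rna_total + kd
    disc = math.sqrt(b * b - 4.0 * protein_total * rna_total)
    # Smaller root of C^2 - b*C + P*R = 0, written in a numerically stable form.
    complex_conc = 2.0 * protein_total * rna_total / (b + disc)
    return plateau * complex_conc / rna_total


KD = 5e-9        # illustrative dissociation constant (M)
PROTEIN = 5e-9   # protein concentration equal to Kd, so theta should be near A/2

trace_rna = theta_exact(PROTEIN, 1e-12, KD)    # RNA far below protein: assumption holds
excess_rna = theta_exact(PROTEIN, 1e-7, KD)    # RNA in large excess: assumption fails
approx = theta_approx(PROTEIN, KD)
print(approx, trace_rna, excess_rna)
```

With trace RNA the two expressions agree closely and half-maximal binding falls at [gp43] = Kd; with RNA in large excess over protein the simplified curve overestimates the bound fraction, which is why the assumption has to be stated.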



Example 1. Selection of RNA Inhibitors of T4 DNA
Polymerase

A 110 base single-stranded DNA template for in
vitro transcription was created as shown in Figure 2 by
ligation of three synthetic oligonucleotides (Table 1,
oligonucleotides 3, 4 and 5) in the presence of two
capping oligonucleotides (Table 1, oligonucleotides 1
and 2). One of the
template-creating oligos was also used as the 3' primer
in reverse transcription of the in vitro transcript and
subsequent amplification in polymerase chain reactions
(PCRs) (Innis et al. (1988) Proc. Natl. Acad. Sci. USA
85:9436-9440). One of the capping oligos (1) contains
the information required for T7 RNA polymerase
transcriptional initiation and sufficient sequence
complementarity to the cDNA of the in vitro transcript
to serve as the 5' primer in the PCR amplification
steps. The DNA template encoded an RNA which contains
the entire RNA recognition site for T4 DNA polymerase
except that a completely random sequence was
substituted in place of the sequence which would encode
the wild-type loop sequence AAUAACUC. The random
sequence was introduced by conventional chemical
synthesis using a commercial DNA synthesizer (Applied
Biosystems) except that all four dNTP's were present in
equimolar amounts in the reaction mixture for each
position indicated by N in the sequence of
oligonucleotide number 4 (Table 1). The random
sequence is flanked by primer annealing sequence




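information for the 5' and 3' oligonucleotides for PCR.
The DNA template is thus a mixture of all loop sequence
variants, theoretically containing 65,536 individual
species. The dissociation constant for the wild-type
loop variant RNA sequence is about 5 x 10^-9 M and for
the population of sequences was measured to be about
2.5 x 10^-7 M, a 50-fold lower binding affinity.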






TABLE 1

i) 5'-TAATACGACTCACTATAGGGAC 



---ataaactaaggaata^'a^^^ 

PCATAG-3 « 



5> 5.-cTTTC»»i»aae»i3mtt»»ircj^ 



three ai„ere„t sjr^rjr '""'"^ ^" « 

concentration of ^ 3 ^ « the 

ana for c the concentration of L, P"'*"-" 
"Wgh protetn.. por A t^! . «>*^ 3 x lo" «, 
about 3 X xo", now "T*""'™ " 

concentration of a^TtT/.^C ' 

°»a rouna consistea of the fouL^ etr;,:"'* 

37-c, «,ehea through a ^"^t^a at 




presence of 50 picomoles of 3' primer in a 50 µl
reaction under conditions described in Gauss et al.
(1987) supra. To the resulting cDNA synthesis 50
picomoles of 5' primer was added, and in a reaction
volume of 100 µl was amplified with Taq DNA
polymerase as described in Innis et al. (1988) supra for 30
cycles.
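
As a rough guide to what 30 cycles can deliver, the ideal yield of a PCR is a doubling per cycle. The sketch below computes the theoretical fold amplification; the per-cycle efficiency parameter and the value 0.8 are assumptions introduced here for illustration and are not figures from the disclosure.

```python
def pcr_fold_amplification(cycles, efficiency=1.0):
    """Ideal fold increase in template copies after a number of PCR cycles.

    efficiency is the fraction of templates copied in each cycle
    (1.0 corresponds to perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


ideal = pcr_fold_amplification(30)
reduced = pcr_fold_amplification(30, efficiency=0.8)  # 0.8 is an assumed value
print(f"{ideal:.3g} {reduced:.3g}")
```

In practice the reaction plateaus once primers and nucleotides become limiting, so these figures are upper bounds rather than expected yields.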



3) Transcription. In vitro transcription is
performed on the selected amplified templates as
described in Milligan et al. (1987) supra, after which
DNaseI is added to remove the DNA template. The
resultant selected RNA transcripts were then used in
step 1 of the next round. Only one-twentieth of the
products created at each step of the cycle were used in
the subsequent cycles so that the history of the
selection could be traced. The progress of the
selection method was monitored by filter binding assays
of labeled transcripts from each PCR reaction. After
the fourth round of selection and amplification, the
labeled selected RNA products produced binding to gp43
equivalent to that of wild-type control RNA. The RNA
products from each round for one experiment (B) and
from the fourth round for all three experiments were
gel-purified and sequenced. In Figure 3, we show the
sequence of the purified in vitro transcripts derived
from the second, third and fourth rounds of selection
and amplification for experiment B. It is clear that
there was no apparent loop sequence bias introduced
until after the third selection. By this point in the
selection, there was a detectable bias which was
complete by the fourth round for the apparent consensus
sequence A(a/g)(u/c)AAC(u/c)(u/c). Batch sequencing of
the RNA transcribed after the fourth selection and
amplification for trials A, B, and C is shown in Figure
4. All three independent runs with different
protein/RNA ratios gave similar results. There is some






apparent bias for wild-type sequence, AAUAACUC, in
trials A and C.

In order to find out what allowable
combinations of the four "variable" positions actually
existed, we used two "cloning" oligonucleotides which
contained restriction site information to amplify
sequences from the fourth round of experiment B, from
which individuals were cloned into pUC18 as described
(Sambrook et al. (1989) supra). The selected batches of
trial B were chosen for further examination because
there appeared to be an even distribution of the two
allowable nucleotides which composed each of the four
"variable" positions. Twenty individual clones that
were identified by colony filter hybridization to the
3' primer were sequenced. None of these individuals
were mutant at any place in the operator sequence
outside of the loop sequence positions that were
deliberately varied. The sequence distributions are
summed up in Figure 7. Surprisingly, the selected RNA
mixture was actually composed of two major loop
sequences. One was the wild-type sequence, AAUAACUC,
of which [...] out of 20 were isolated. The other,
AGCAACCU, was mutant at four positions and existed in
[...] of the 20 clones. The other three loop sequences
detected were single mutations of these two major
sequences. Filter binding experiments with labeled in
vitro transcripts derived from each of these clonal
isolates indicated that there was a rough correlation
between binding affinity of an RNA for gp43 and
selected abundance.

Example 2: Isolation of a specific RNA ligand for HIV
reverse transcriptase

The reverse transcriptase activity of HIV-1 is
composed of a heterodimer of two subunits that have
common amino termini. The extra carboxyterminal
region of the larger peptide comprises




the RNaseH domain of reverse transcriptase; the 
structure of that domain has recently been determined 
at high resolution. 

It has been previously shown that this HIV-1
reverse transcriptase directly and specifically
interacts with its cognate primer tRNA(Lys3) to which it
was experimentally cross-linked at the anti-codon loop
and stem. It was also found that only the heterodimer
exhibited this specific RNA recognition; neither
homodimeric species of reverse transcriptase bound with
specificity to this tRNA.

Two template populations (with approximately
10^14 different sequences each) were created for use in
SELEX by ligation. One template population was
randomized over 32 nucleotide positions, using fixed
sequences at the ends of the randomized region to
afford cDNA synthesis and PCR amplification. The
second template population had, as additional fixed
sequence at the 5' end of the RNA, the anticodon loop
and stem of tRNA(Lys3). (All oligos used in this work are
shown in Table 2.) There was no difference in the
affinity of the two randomized populations for HIV-1
reverse transcriptase [RT] (and, as is shown, the RNAs
which were selected did not utilize either 5' region in
specific binding). Nine rounds of SELEX with each
population were performed using the heterodimer HIV-RT
as the target protein.

The mechanism by which the randomized DNA was
prepared utilizing ligations and bridging
oligonucleotides was described previously. Such
methodology can diminish the total number of different
sequences in the starting population from the
theoretical limit imposed by DNA synthesis at the 1
micromole scale.
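
The relationship between sequence space and the amount of material actually handled can be made concrete. The sketch below compares the number of possible 32-position sequences with the number of molecules in 1 micromole and in the ligated template of this example (about 1 nanomole of each oligonucleotide at roughly 50% yield). The helper names are illustrative, and the calculation assumes every molecule carries a distinct sequence.

```python
AVOGADRO = 6.022e23  # molecules per mole


def sequence_space(randomized_positions, alphabet_size=4):
    """Number of distinct sequences possible over the randomized region."""
    return alphabet_size ** randomized_positions


def molecules(moles):
    """Number of molecules in a given amount of material."""
    return moles * AVOGADRO


possible = sequence_space(32)             # 32 randomized positions, per the text
after_ligation = molecules(1e-9 * 0.5)    # about 1 nanomole ligated at roughly 50% yield
at_synthesis_scale = molecules(1e-6)      # 1 micromole synthesis scale
coverage = after_ligation / possible
print(f"{possible:.3g} {after_ligation:.3g} {coverage:.2g}")
```

On these assumptions the ligated template holds on the order of 10^14 molecules, a small fraction of the roughly 10^19 sequences that 32 random positions allow, so the starting population samples sequence space rather than covering it.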

In these ligation reactions about 1 nanomole of
each oligonucleotide was used. The ligated product was
gel-purified with an approximate yield of 50%. This




purified template was transcribed with T7 RNA
polymerase as described above. It was found that HIV
RT could saturably bind this random population with a
half-maximal binding occurring at about 7 x 10^-7 M as
determined by nitrocellulose assays. All RNA-protein
binding reactions were done in a binding buffer of 200
mM KOAc, 50 mM Tris-HCl pH 7.7, 10 mM dithiothreitol.
RNA and protein dilutions were mixed and stored on ice
for 30 minutes then transferred to 37°C for 5 minutes.
(In binding assays the reaction volume is 60 µl of
which 50 µl is assayed; in SELEX rounds the reaction
volume is 100 µl.) Each reaction is suctioned through
a prewet (with binding buffer) nitrocellulose filter
and rinsed with 3 ml of binding buffer after which it
is dried and counted for assays or subjected to elution
as part of the SELEX protocol. Nine rounds were
performed. The RNA concentration for all nine rounds
was approximately 3 x 10^-5 M. HIV-RT was 2 x 10^-8 M in
the first selection and 1 x 10^-8 M in selections 2-9.

The experiment using RNA containing the tRNA(Lys3)
anticodon loop and stem was completed first.
Nitrocellulose filter binding assays performed at the
ninth round revealed that the RNA population had
increased about 100-fold in affinity to HIV-1 RT when
compared to the starting candidate mixture, but that the
background binding to nitrocellulose filters in the
absence of protein had increased from about 2% of input
RNA to 15%. Individual sequences were cloned from this
population (after filtration through nitrocellulose
filters to delete some of the high background of
potential sequences selected for retention by filters
alone) and are listed in Table 3. Nitrocellulose
filter binding assays of selected sequences' affinity
for HIV RT are shown in Figure 14. Some of the
sequences were selected as ligands for HIV-RT,
exemplified by the binding curves of ligands 1.1 and
1.3a, and show some sequence homology as illustrated by




Tables 4 and 5. Some of the ligand sequences exhibit
significant retention on nitrocellulose filters in the
absence of protein, exemplified by ligand 1.4 (Figure
14), and seem to be characterized by a long helix with
a loop of purine repeat elements (as shown in Table 4).
In spite of our minimal, late efforts to delete them in
this experiment prior to cloning, these sequences
represented a significant part of those collected from
this experiment.

As a consequence, experiment 2 (which has a
different 5' fixed sequence) was pre-filtered through
nitrocellulose before the first, third, sixth and ninth
rounds of selection. The sequences collected from this
experiment are shown in Table 6. There are again many
sequences with homology to those of high affinity from
experiment 1 as shown in Tables 4 and 5. There are
many fewer, if any, sequences that fit the motif of
sequences retained by nitrocellulose filters alone.
Nitrocellulose binding assays of selected ligand
sequences from this experiment compared to that of
ligand 1.1 are shown in Figure 15.

High affinity ligand RNAs with the most common
sequence (1.1) and a similar sequence (1.3a) were
further analyzed to determine the boundaries of the
information required for high affinity binding to HIV-1
RT. The results of these experiments are shown in
Figure 16. These experiments establish that the motif
common to these sequences, UUCCGNNNNNNNNCGGGAAA, is
similarly positioned within the recognition domain.
The sequences UUCCG and CGGGA of this motif may
base-pair to form an RNA helix with an eight base loop. In
order to discover what besides these fixed sequences
may contribute to high affinity binding to HIV-1 RT, a
candidate mixture template was created that contained
random incorporation at the nucleotide positions that
differ from these two sequences as shown in Table 7.
After eight rounds of SELEX, individual sequences were






cloned and sequenced. The 46 sequences are shown in
Table 7. Inspection of these sequences reveals
extensive base-pairing between the central 8N variable
region and the downstream 4N variable region and
flanking sequences; base-pairing which in combination
with that discussed above would indicate an RNA
pseudoknot. That no specific sequences predominate in
this evolved population suggests that there is no
selection at the primary sequence level and that
selection occurs purely on the basis of secondary
structure, that is, there are many sequence
combinations that give similar affinities for HIV-1 RT
and none have competitive advantage. Analysis of the
first and second SELEX experiments reveals that the
individual sequences which comprise those populations
that have homology to the UUCCG...CGGGANAA motif also
show a strong potential for this pseudoknot
base-pairing.
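
The proposed pairing of UUCCG with CGGGA can be verified position by position. The sketch below aligns the two segments antiparallel and classifies each pair as Watson-Crick, G-U wobble or mismatch. Whether the text counts the wobble pair as part of the helix is not stated, so treating it as paired is an assumption made here.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pair_stem(five_prime, three_prime):
    """Pair two equal-length strands antiparallel and classify each base pair."""
    if len(five_prime) != len(three_prime):
        raise ValueError("strands must have equal length")
    result = []
    # The first base of one strand faces the last base of the other.
    for base, partner in zip(five_prime, reversed(three_prime)):
        if (base, partner) in WATSON_CRICK:
            kind = "WC"
        elif (base, partner) in WOBBLE:
            kind = "wobble"
        else:
            kind = "mismatch"
        result.append((base, partner, kind))
    return result


stem = pair_stem("UUCCG", "CGGGA")
kinds = [kind for _, _, kind in stem]
print(stem)
```

All five positions can pair: four Watson-Crick pairs and one U-G wobble, consistent with a five base pair helix closing the eight base loop.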

Figure 31 shows a schematic diagram of what is
referred to herein as a pseudoknot. A pseudoknot is
comprised of two helical sections and three loop
sections. Not all pseudoknots contain all three loops.
For the purposes of interpreting the data obtained, the
various sections of the pseudoknot have been labeled as
shown in Figure 31. For example, in Table 5 several of
the sequences obtained in experiments one and two are
listed according to the pseudoknot configuration
assumed by the various sequences.

The results of experiments one and two, as
defined in Table 5, led to experiment three wherein
sequences in S1(a), S1(b) and L3 were fixed. Again,
the SELEX derived nucleic acids were configured almost
exclusively in pseudoknots. Examination of the results
in each of the experiments reveals that the nucleic
acid solution to HIV-RT contains a relatively large
number of members, the most basic common denominator
being that they are all configured as pseudoknots.




Other generalizations defining the nucleic acid
solution for HIV-RT are as follows:

1) S1(a) often comprises the sequence
5'-UUCCG-3' and S1(b) often comprises the sequence
5'-CGGGA-3'. However, base pair flips are allowed, and
the stem may be shortened.

2) L1 may be short or long, but often
comprises two nucleotides in the best binding nucleic
acids. The 5' nucleotide in L1 often is either a U or
an A.

3) S2 is usually comprised of 5 or 6 base
pairs, and appears to be sequence independent. This
stem may contain non-Watson/Crick pairs.

4) L2 may be comprised of no nucleotides, but
when it exists, the nucleotides are preferably A's.

5) L3 is generally 3 or more nucleotides,
enriched in A.

6) In most sequences obtained by SELEX, the
total number of nucleotides in L1, S2(a) and L2 equals
8.
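
These generalizations can be written as simple checks on a segmented pseudoknot. The sketch below is illustrative only: the segment order S1(a), L1, S2(a), L2, S1(b), L3, S2(b) from 5' to 3' is an assumption inferred from the UUCCG-N8-CGGGA motif, the strict tests represent tendencies ("often", "usually") as hard rules, and the example sequence is hypothetical rather than one of the isolated ligands.

```python
from dataclasses import dataclass


@dataclass
class Pseudoknot:
    """Segments named as in the text, assumed to run 5' to 3' in this order."""
    s1a: str
    l1: str
    s2a: str
    l2: str
    s1b: str
    l3: str
    s2b: str


def check_generalizations(pk):
    """Return a pass/fail flag for each numbered generalization."""
    return {
        "1_canonical_stem1": pk.s1a == "UUCCG" and pk.s1b == "CGGGA",
        "2_l1_two_nt_starting_U_or_A": len(pk.l1) == 2 and pk.l1[0] in "UA",
        "3_s2_five_or_six_bp": len(pk.s2a) in (5, 6) and len(pk.s2a) == len(pk.s2b),
        "4_l2_empty_or_all_A": set(pk.l2) <= {"A"},
        "5_l3_three_or_more": len(pk.l3) >= 3,
        "6_l1_s2a_l2_total_8": len(pk.l1) + len(pk.s2a) + len(pk.l2) == 8,
    }


# Hypothetical example constructed to satisfy every rule.
example = Pseudoknot("UUCCG", "UA", "GCGCU", "A", "CGGGA", "AAA", "AGCGC")
report = check_generalizations(example)
print(report)
```

Rule 6 corresponds to the eight nucleotides lying between UUCCG and CGGGA in the motif; a real analysis would score the rules as preferences rather than pass/fail tests.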

A primary purpose of this experiment was to
find ligand solutions to HIV-1 RT. The ability of the
evolved ligand clone 1.1 was compared to the ability of
the starting population for experiment 1 to inhibit
reverse transcriptase activity, and is shown in Figure
17. Even at equal concentrations of inhibitor RNA to
RT, the reverse transcriptase is significantly
inhibited by ligand 1.1. In contrast, only at 10 mM
(or 200-fold excess) starting population RNA is there
any significant inhibition of the HIV-1 RT. Thus, the
high affinity ligand to HIV-1 RT either blocks or
directly interacts with the catalytic site of the
enzyme.

In order to test the specificity of this
inhibition, various concentrations of ligand 1.1 were
assayed for inhibition of MMLV, AMV and HIV-1 reverse
transcriptase. The results of that experiment which






are shown in Figure 18 show that the inhibition by
ligand 1.1 is specific to HIV-1 reverse transcriptase.

Example 3: Isolation of specific RNA ligand for
bacteriophage R17 coat protein

SELEX was performed on the bacteriophage R17
coat protein. The protein was purified as described by
Carey et al., Biochemistry, 22, 2601 (1983). The
binding buffer was 100 mM potassium acetate plus 10 mM
dithiothreitol plus 50 mM Tris-acetate pH 7.5. Protein
and RNA were incubated together for three minutes at
37°C and then filtered on nitrocellulose filters to
separate protein-bound RNA from free RNA. The filters
were washed with 50 mM Tris-acetate pH 7.5. Protein
was at 1.2 x 10^-7 M for the first four rounds of SELEX
and at 4 x 10^-8 M for rounds five through 11.

The starting RNA was transcribed from DNA as
described previously. The DNA sequence includes a
bacteriophage T7 RNA polymerase promoter sequence that
allows RNA to be synthesized according to standard
techniques. cDNA synthesis during the amplification
portion of the SELEX cycle is primed by a DNA of the
sequence:

cDNA primer (PCR primer 1):
5' GTTTCAATAGAGATATAAAATTCTTTCATAG 3'

The DNA primer used to amplify the cDNA was,
thus, the sequence including the T7 promoter, 32
randomized positions, an AT dinucleotide, and the fixed
sequence complementary to PCR primer 1. The RNA that
is used to begin the first cycle of SELEX thus has the
sequence:

pppGGGAGCCAACACCACAAUUCCAAUCAAG-32N-AUCUAUGAAAGAAUUUUAUAUCUCUAUUGAAAC
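
The 3' fixed region of the starting RNA is determined by the cDNA primer, since the primer must anneal to it. The sketch below derives that region as the reverse complement of PCR primer 1, written as RNA; the function and variable names are illustrative.

```python
# DNA base in the primer -> complementary base in the RNA it anneals to.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def rna_complement_of_primer(dna_primer):
    """Return the RNA sequence (5' to 3') that anneals to a DNA primer."""
    return "".join(COMPLEMENT[base] for base in reversed(dna_primer))


CDNA_PRIMER = "GTTTCAATAGAGATATAAAATTCTTTCATAG"  # PCR primer 1, from the text
three_prime_fixed = rna_complement_of_primer(CDNA_PRIMER)
print(three_prime_fixed)
```

The result is the 31 nucleotides that follow the AU dinucleotide at the 3' end of the starting RNA, which gives an independent check on that part of the sequence.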

A set of clones from after the 11th round of
SELEX was obtained and sequenced. Within the 38
different sequences obtained in the 47 clones were
three found more than once: one sequence was found six




times, one sequence four times, and another two times.
The remaining 35 sequences were found once each. Two
sequences were not similar to the others with respect
to primary sequences or likely secondary structures,
and were not analyzed further. Thirty-six sequences
had in common the sequence ANCA situated as a
tetranucleotide loop of a bulged hairpin; the bulged
nucleotide was an adenine in all 36 cases. The
sequences of the entire set are given in Table 8,
aligned by the four nucleotides of the hairpin loop.
The two nucleotides 3' to the randomized portion of the
starting RNA (an AU) are free to change or be deleted
since the cDNA primer does not include the
complementary two nucleotides; many clones have changed
one or both of those nucleotides.
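
Locating candidate ANCA tetraloops is the first step in aligning clones by their hairpin loop. The sketch below is a simplified search, not the alignment procedure actually used: it looks for an A-N-C-A tetranucleotide closed by at least two base pairs and does not model the bulged adenine. Both test hairpins are hypothetical sequences written for this illustration and are not entries from Table 8.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_anca_loops(rna, closing_pairs=2):
    """Return (index, loop) for each ANCA tetranucleotide closed by base pairs."""
    hits = []
    for i in range(closing_pairs, len(rna) - 4 - closing_pairs + 1):
        loop = rna[i:i + 4]
        if loop[0] != "A" or loop[2:] != "CA":
            continue
        # The k-th base 5' of the loop must pair with the k-th base 3' of it.
        closed = all(
            (rna[i - k], rna[i + 3 + k]) in PAIRS
            for k in range(1, closing_pairs + 1)
        )
        if closed:
            hits.append((i, loop))
    return hits


with_c = find_anca_loops("ACAUGAGGAUCACCCAUGU")   # hypothetical hairpin, AUCA loop
with_u = find_anca_loops("ACAUGAGGAUUACCCAUGU")   # same hairpin with an AUUA loop
print(with_c, with_u)
```

The first sequence yields a single AUCA loop closed by two G-C pairs; the second, which carries U at the third loop position, is not reported because it does not match ANCA.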

The winning RNA motif, shown in Figure 19,
bears a direct relationship to the coat binding site
identified earlier through site-directed mutagenesis
and binding studies. See, Uhlenbeck et al. supra
(1983); Romaniuk et al. supra (1987). However, some of
the sequences are more conserved in this set than might
have been expected. The loop sequence AUCA
predominates, while earlier binding data might have
suggested that ANCA sequences are all equivalent. The
natural binding site on the R17 genome includes the
sequence and structure shown below:

      UU
     A  A
      GC
      GC
     A
      GC

The natural structure includes the sequence
GGAG, which serves to facilitate ribosome binding and
initiation of translation of the R17 replicase coding
region. During SELEX that requirement is not present,
and the winning sequences contain around the loop and
bulge C:G base pairs more often than G:C base pairs.
SELEX, therefore, relaxes the constraints of biology and




[...] affinities than the natural ligand. Similarly, the
loop cytidine found in each of the 36 sequences is a
uridine in the natural site [...]
may lead to disadvantages for the organism.

Example 4: Isolation of a nucleic acid ligand for a
serine protease

Serine proteases are protein enzymes that
cleave peptide bonds within proteins. The serine
proteases are members of a [...]
formation. Proteases [...]
also important in [...]
would be targets for nucleic acid ligands with
appropriate affinities obtained according to the
invention herein taught.

Human tissue plasminogen activator (htPA) [...]
HSV DNA polymerase experiment [...]
Tris-acetate pH [...] for 3 minutes at 37 degrees.
SELEX was carried out for ten rounds. The 30N
candidate mixture bound to tPA with an affinity (Kd) of
7 x 10(-8) M in 150 mM NaAc plus 50 mM Tris-acetate pH




7.5; the affinity of the RNA present after nine rounds
of SELEX was about threefold tighter. Nine clones were
isolated, sequenced, and some of these were tested for
binding to tPA as pure RNAs. The sequences of the nine
clones obtained at low salt were as follows:

Name   Number   Sequence of random region
A1     3        ACGAAACAAAUAAGGAGGAGGAGGGAUUGU
A2     1        AGGAGGAGGAGGGAGAGCGgAAAnCAGAmT
A3     1        AGGAGGAGGAGGUAGAGgAPGnAntTAAC&r:
B      1        UAAGCAAGAAUCUACGAUAAAUACGUGAAC
C      1        AGUGAAAGACGACAACGAAAAACGACCACA
D      1        CCGAGCAUGAGCCUAGUAAGUGGUGGAUA
E      1        UAAUAAGAGAUACGACAGAAUACGACAUAA

All tested sequences bound at least somewhat
better than the starting 30N candidate mixture.
However, the A series bound to nitrocellulose better in
the absence of tPA than did the candidate mixture, as
though the shared sequence motif caused retention on
the nitrocellulose matrix by itself. That motif is
underlined in the sequences shown above. In other
SELEX experiments AGG repeats have been isolated when
trying to identify a ligand solution to HIV-1 reverse
transcriptase, the human growth hormone receptor
extracellular domain, and even the R17 coat protein in
a first walking experiment. When tested, these
sequences show modest or substantial binding to
nitrocellulose filters without the target protein being
present. It appears that the AGG repeats may be found
in hairpin loops. Since SELEX is an iterative process
in most embodiments, it is not surprising that such
binding motifs would emerge.
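
A screen for the AGG repeat motif is straightforward to sketch. The code below counts the longest run of consecutive AGG triplets in a sequence. The three sequences are the legible entries for clones A1, B and E as read from the listing above (A1 with the digit 6 read as G), and any threshold for calling a clone a likely filter binder would be an assumption layered on top of this.

```python
import re

AGG_RUN = re.compile(r"(?:AGG)+")


def longest_agg_run(rna):
    """Return the largest number of consecutive AGG repeats in a sequence."""
    runs = [len(match.group(0)) // 3 for match in AGG_RUN.finditer(rna)]
    return max(runs, default=0)


clones = {
    "A1": "ACGAAACAAAUAAGGAGGAGGAGGGAUUGU",
    "B": "UAAGCAAGAAUCUACGAUAAAUACGUGAAC",
    "E": "UAAUAAGAGAUACGACAGAAUACGACAUAA",
}
agg_runs = {name: longest_agg_run(seq) for name, seq in clones.items()}
print(agg_runs)
```

Clone A1 carries four consecutive AGG repeats while B and E carry none, matching the observation that the A series shares the motif associated with retention on nitrocellulose.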

The existence of nitrocellulose binding motifs
may be avoided by one or more of several obvious
strategies. RNA may be filtered through the
nitrocellulose filters prior to SELEX to eliminate such




motifs. Alternative matrices may be used in 
alternative rounds of SELEX, e.g., glass fiber filters. 
Alternative partitioning systems may be used, e.g., 
columns, sucrose gradients, etc. it is obvious that 
any given single process will lead to biases in the 
iterative process that will favor motifs that do not 
have increased binding to the target, but are selected 
by the selection process. It is, therefore, important 
to use alternating processes or screening processes to 
eliminate these motifs, it has been shown that the A6G 
repeats, like other motifs isolated as biases that are 
target independent, will tend to emerge most frequently 
when the affinity of the best sequences for the target 
are rather low or when the affinities of the best 
sequences are only sli^tly better than the affinity of 
the starting candidate mixture for the target. 
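
The screening strategy described above can be illustrated by checking cloned sequences for runs of AGG repeats before binding studies are undertaken. The following is a minimal sketch under stated assumptions and is not part of the disclosed method: the function name and the threshold of three consecutive AGG triplets are illustrative choices, and the A1 sequence is given as read from the list above with the scanned "6" characters taken to be G.

```python
import re

# Assumption: a run of three or more consecutive AGG triplets is treated
# as a possible nitrocellulose binding motif. The threshold is illustrative.
AGG_REPEAT = re.compile(r"(?:AGG){3,}")


def has_agg_repeat(seq: str) -> bool:
    """Return True if the RNA sequence contains a run of AGG repeats."""
    return AGG_REPEAT.search(seq.upper()) is not None


# Randomized regions of two clones from the tPA experiment
# (A1 with scanning errors corrected, which is an assumption).
clones = {
    "A1": "ACGAAACAAAUAAGGAGGAGGAGGGAUUGU",
    "B": "UAAGCAAGAAUCUACGAUAAAUACGUGAAC",
}

# Names of clones that would be set aside for a filter-only control.
flagged = [name for name, seq in clones.items() if has_agg_repeat(seq)]
print(flagged)
```

Clones flagged in this way would still need to be tested on nitrocellulose in the absence of target, since the sequence screen alone does not measure filter retention.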

Isolation of a nucleic acid ligand for a
mammalian receptor
Mammalian receptors often are proteins that
reside within the cytoplasmic membranes of cells and
respond to molecules circulating outside of those
cells. Most receptors are not known to bind to nucleic
acids. The human growth hormone receptor responds to
circulating human growth hormone, while the insulin
receptor responds to circulating insulin. Receptors
often have a globular portion of the molecule on the
extracellular side of the membrane, and said globular
portion specifically binds to the hormone (which is the
natural ligand). Many disease states can be treated
with nucleic acid ligands that bind to receptors.

Ligands that bind to a soluble globular domain
of the human growth hormone receptor (shGHR) are
identified and purified using the candidate mixture of
Example 4. Again, the binding buffers are free of DTT.
The soluble globular domain of the human growth
hormone receptor is available from commercial and
academic sources, having usually been created through
recombinant DNA technology applied to the entire gene
encoding a membrane-bound receptor protein. SELEX is
used reiteratively until ligands are found. The
ligands are cloned and sequenced, and binding
affinities for the soluble receptor are measured.
Binding affinities are measured for the same ligand for
other soluble receptors in order to ascertain
specificity, even though most receptors do not show
strong protein homologies with the extracellular
domains of other receptors. The ligands are used to
measure inhibition of the normal binding activity of
shGHR by measuring competitive binding between the
nucleic acid ligand and the natural (hormone) ligand.

Example 6: Isolation of a nucleic acid ligand for a
mammalian hormone or factor
Mammalian hormones or factors are proteins,
e.g., growth hormone, or small molecules (e.g.,
epinephrine, thyroid hormone) that circulate within the
animal, exerting their effects by combining with
receptors that reside within the cytoplasmic membranes
of cells. For example, the human growth hormone
stimulates cells by first interacting with the human
growth hormone receptor, while insulin stimulates cells
by first interacting with the insulin receptor. Many
growth factors, e.g., granulocyte colony stimulating
factor (GCSF), including some that are cell-type
specific, first interact with receptors on the target
cells. Hormones and factors, then, are natural ligands
for some receptors. Hormones and factors are not
known, usually, to bind to nucleic acids. Many disease
states, for example hyperthyroidism and chronic
hypoglycemia, can be treated with nucleic acid ligands
that bind to hormones or factors.

Ligands that bind to human insulin are
identified and purified using the starting material of
Example 3. Human insulin is available from commercial
sources, having usually been created through
recombinant DNA technology. SELEX is used
reiteratively until a ligand is found. The ligands are
cloned and sequenced, and the binding affinities for
human insulin are measured. Binding affinities are
measured for the same ligand for other hormones or
factors in order to ascertain specificity, even though
most hormones and factors do not show strong protein
homologies with human insulin. However, some hormone
and factor gene families exist, including a small
family of IGF, or insulin-like growth factors. The
nucleic acid ligands are used to measure inhibition of
the normal binding activity of human insulin to its
receptor by measuring competitive binding with the
insulin receptor and the nucleic acid ligand in the
presence or absence of human insulin, the natural
ligand.

20 IssaSEleJI: fy^payation of c«1,m» ^.^^^^ 

Following the procedures as described in
Example 9 below, it was shown that the polypeptide
bradykinin is not retained by nitrocellulose. To
enable the SELEX process on bradykinin, the protein was
attached to Activated CH Sepharose 4B (Pharmacia LKB)
as a support matrix according to standard procedures.
The resulting matrix was determined to be 2.0 mM
bradykinin by ninhydrin assay, see Crestfield et al.,
J. Biol. Chem., vol. 238, pp. 622-627 (1963);
Arch. Biochem. Biophys., vol. 67, pp. 10-15
(1957). The activated groups remaining on the support
matrix were blocked with Tris, see Pharmacia, Affinity
Chromatography: Principles and Methods, Ljungforetagen
AB, Uppsala, Sweden (1988).

Spin-column separation was used to contact
solutions of candidate mixtures with beaded matrix. In
a general procedure for performing a selection step for
SELEX, 40 µL of a 50:50 slurry of target sepharose in
reaction buffer is transferred to a 0.5 ml Eppendorf
tube. The RNA candidate mixture is added with 60 µL of
reaction buffer, and the reaction mixture is allowed to
equilibrate for 30 minutes at 37°C. A hole is pierced
in the bottom of the tube, and the tube is placed
inside a larger Eppendorf tube, both caps removed, and
the tubes spun (1000 RPM, 10", 21°C) to separate the
eluate. The small tube is then transferred to a new
larger tube, and the contents washed four times by
layering with 50 µL of the selected wash buffer and
spinning. To conduct binding assays, the tube
containing the radioactive RNA is transferred to a new
Eppendorf tube and spun to dryness.

A bulk binding experiment was performed wherein
an RNA candidate mixture comprised of a 30 nucleic acid
randomized segment was applied to the bradykinin
sepharose matrix. Using the spin-column technique, the
binding of the bulk 30N RNA to various matrices was
determined under high salt concentrations to determine
the best conditions for minimizing background binding
to the sepharose. Background binding of RNA to
sepharose was minimized by blocking activated groups
on the sepharose with Tris, and using a binding buffer
of 10 mM DEM and 10-20 mM KOAc. At this buffer
condition, a binding curve of the randomized bulk
solution of RNA yielded a bulk Kd of about 1.0 x 10"^.
See Figure 20. The curve was determined by diluting
the bradykinin sepharose against blocked, activated
sepharose.

Example 8: Preparation of candidate mixtures enhanced
in RNA motif structures.
In the preferred embodiment, the candidate
mixture to be used in SELEX is comprised of a
contiguous region of between 20 and 50 randomized
nucleic acids. The randomized segment is flanked by
fixed sequences that enable the amplification of the
selected nucleic acids.

In an alternate embodiment, the candidate
mixtures are created to enhance the percentage of
nucleic acids in the candidate mixture possessing given
nucleic acid motifs. Although two specific examples
are given here, those
skilled in the art would be capable of creating
equivalent candidate mixtures to achieve the same
general result.

In one specific example, shown as Sequence A in
Figure 21, the candidate mixture is prepared so that
most of the nucleic acids in the candidate mixture will
be biased to form a helical region of between 4 and 8
base pairs and a "loop" of 20 or 21 contiguous
randomized sequences. Both 5' and 3' ends of the
sequence mixture will contain fixed sequences that are
essential for the amplification of the nucleic acids.
Adjacent these functional fixed sequences will be fixed
sequences chosen to base pair with fixed sequences on
the alternate side of the randomized region. Going
from the 5' to the 3' end of the sequences, there will
be 5 distinct regions: 1) fixed sequences for
amplification; 2) fixed sequences for forming a helical
structure; 3) 20 or 21 randomized nucleic acid
residues; 4) fixed sequences for forming a helical
structure with the region 2 sequences; and 5) fixed
sequences for amplification. The A candidate mixture
of Figure 21 will be enriched in hairpin loop and
symmetric and asymmetric bulged motifs. In a preferred
embodiment, the candidate mixture would contain equal
amounts of sequences where the randomized region is 20
and 21 bases long.

A second example, shown in Figure 21 as
sequence B, is designed to enrich the candidate mixture
in nucleic acids held in the pseudoknot motif. In this
candidate mixture, the fixed amplification sequences
flank three regions of 12 randomized positions. The
three randomized regions are separated by two fixed
regions of four nucleotides, the fixed sequences
selected to preferably form a four basepair helical
structure. Going from the 5' to the 3' end of the
sequence, there will be 7 distinct regions: 1) fixed
sequences for amplification; 2) 12 randomized
nucleotides; 3) fixed sequences for forming a helical
structure; 4) 12 randomized nucleotides; 5) fixed
sequences for forming a helical structure with the
region 3 nucleotides; 6) 12 randomized nucleotides; and
7) fixed sequences for amplification.

In a preferred candidate mixture, the
engineered helical regions are designed to yield
alternating GC, CG, GC, CG basepairs. This basepair
motif has been shown to give a particularly stable
helical structure.
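
The seven-region layout of sequence B can be sketched in code as follows. This is an illustration under stated assumptions only: the flanking amplification sequences below are hypothetical placeholders (the actual fixed sequences of Figure 21 are not reproduced here), and the four-nucleotide helix strand GCGC is assumed as one way to obtain alternating GC, CG, GC, CG basepairs.

```python
import random

# Hypothetical placeholders for the fixed amplification sequences.
FIXED_5 = "GGGAGCUCAG"
FIXED_3 = "UUCGACAUGA"

# Assumed helix strand; pairing it with its reverse complement gives
# the alternating G-C, C-G, G-C, C-G base pairs.
HELIX_5 = "GCGC"

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs antiparallel with the given RNA."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


HELIX_3 = reverse_complement(HELIX_5)


def random_region(length: int, rng: random.Random) -> str:
    """Draw a randomized region with equal weight for each nucleotide."""
    return "".join(rng.choice("ACGU") for _ in range(length))


def sequence_b_candidate(rng: random.Random) -> str:
    """Assemble the 7 regions in 5' to 3' order."""
    regions = [
        FIXED_5,                 # 1) fixed, amplification
        random_region(12, rng),  # 2) 12 randomized nucleotides
        HELIX_5,                 # 3) fixed, helix strand
        random_region(12, rng),  # 4) 12 randomized nucleotides
        HELIX_3,                 # 5) fixed, pairs with region 3
        random_region(12, rng),  # 6) 12 randomized nucleotides
        FIXED_3,                 # 7) fixed, amplification
    ]
    return "".join(regions)


candidate = sequence_b_candidate(random.Random(0))
print(candidate)
```

In practice the randomized positions would be introduced during chemical synthesis of the DNA template; the sketch only shows the ordering and lengths of the regions.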



Example 9: Bulk binding of randomized RNA sequences
to proteins not known to bind nucleic
acids.

Following the general nitrocellulose selection
procedures as described in Example 1 above for SELEX, a
group of randomly selected proteins was tested to
determine if they showed any affinity to a bulk
candidate mixture of RNA sequences. The candidate
mixture utilized in each experiment consisted of a 40N
RNA solution (a randomized mixture having a 40
randomized nucleic acid segment) that was radiolabeled
to detect the percentage of binding. The candidate
mixture was diluted in binding buffer (200 mM KOAc, 50
mM Tris-OAc pH 7.7, 10 mM DTT) and 30 µL was used in a
60 µL binding reaction. To each reaction was added 20
µL, 10 µL or 1 µL of each protein. Binding buffer was
added to reach a total volume of 60 µL. The reactions
were incubated at 37°C for 5 minutes and then subjected
to filter binding.
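
The make-up volume of binding buffer for each reaction follows directly from the volumes given above. The short sketch below works through that arithmetic; the function name is illustrative, and the volumes are read as microliters, which is an assumption about the scanned units.

```python
REACTION_VOLUME_UL = 60.0  # total binding reaction volume
RNA_VOLUME_UL = 30.0       # diluted candidate mixture per reaction


def buffer_volume_ul(protein_volume_ul: float) -> float:
    """Binding buffer needed to bring the reaction to the total volume."""
    remaining = REACTION_VOLUME_UL - RNA_VOLUME_UL - protein_volume_ul
    if remaining < 0:
        raise ValueError("protein volume exceeds the available volume")
    return remaining


# The three protein volumes used for each protein tested.
volumes = {p: buffer_volume_ul(p) for p in (20.0, 10.0, 1.0)}
print(volumes)
```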






The proteins tested were Acetylcholinesterase
(MW 230,000); N-acetyl-β-D-glucosaminidase (MW
180,000); Actin (MW 43,000); Alcohol Dehydrogenase
(MW 240,000); Aldehyde Dehydrogenase (MW 200,000);
Angiotensin (MW 1297); Ascorbate Oxidase (MW 140,000);
Atrial Natriuretic Factor (MW 3,064); and Bombesin (MW
1621). The proteins were purchased from Boehringer
Ingelheim, and were utilized in the buffer composition
in which they are sold.

The RNA candidate mixture used in each
experiment contained 10,726 counts of radiolabel, and a
background binding of about 72 counts was found. The
results are summarized in Table 9. All proteins tested
except Acetylcholinesterase, N-acetyl-β-D-
glucosaminidase and Actin were found to yield some bulk
RNA affinity. Because of the low concentration of
N-acetyl-β-D-glucosaminidase in solution as purchased,
the results for that protein are not definitive. In
addition, if any of the proteins tested do not bind to
nitrocellulose (which is the case for bradykinin), no
affinity would be detected in this experiment. Example
7 above discussing column supported bradykinin
demonstrates that the failure to show bulk binding in
this experiment does not mean that bulk binding does
not exist for a given protein.
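
The percentage of binding in such a filter assay can be estimated from the counts retained on the filter, the background, and the input counts. The sketch below shows one common way to do this and is offered as an assumption, since the exact calculation behind Table 9 is not stated: background is subtracted from the filter counts and the result is divided by the total input counts. The filter counts in the usage line are hypothetical.

```python
TOTAL_COUNTS = 10726      # counts of radiolabel in each reaction
BACKGROUND_COUNTS = 72    # counts retained with no protein present


def percent_bound(filter_counts: float,
                  total: float = TOTAL_COUNTS,
                  background: float = BACKGROUND_COUNTS) -> float:
    """Background-corrected percentage of input RNA retained on the filter."""
    corrected = max(filter_counts - background, 0.0)
    return 100.0 * corrected / total


# Hypothetical filter reading, for illustration only.
example = percent_bound(1144.0)
print(round(example, 2))
```

A reading equal to the background gives zero by construction, so a protein that is itself not retained by nitrocellulose would appear as a non-binder regardless of its true affinity.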

Example 10: Isolation of RNA ligand solution for Nerve
Growth Factor
Nerve growth factor (NGF) is a protein factor 
that acts through a receptor on the outside surfaces of 
target cells. Antagonists toward growth factors and 
other hormones can act by blocking a receptor or by 
titrating the factor or hormone. An RNA was sought by 
the SELEX process that binds directly to NGF. 

The starting RNAs were prepared exactly as in
the case of HSV DNA polymerase (Example 11).

Two different experiments were done with NGF.
The first was a ten round SELEX using low salt binding
buffer, 3 minutes at 37 degrees incubation, and then
filtration and a wash with the same buffer during the
SELEX. The low salt binding buffer was 50 mM NaCl plus
50 mM Tris-acetate pH 7.5. The second experiment used
as the binding buffer 200 mM NaCl plus 50 mM
Tris-acetate pH 7.5, and then after filtration a wash with
50 mM Tris-acetate pH 7.5; this SELEX experiment went
through only seven rounds.

The low salt experiment yielded 36 cloned
sequences. Fifteen of the clones were nearly identical:
#'s 2, 3, 4, 5, 6, 8, 11, 13, 19, 22, 28, 33, and 34
were identical, while #'s 15 and 25 had a single
difference:

ACAUCGAUGACCGGAAUGCCGCACACAGAG
-l-A G
(15) (25)

A second abundant sequence, found six times,
was:

CCUCAGAGCGCAAGAGUCGAACGAAUACAG (#'s 12, 20, 27, and 31)

From the high salt SELEX ten clones have been
sequenced, but eight of them are identical and
obviously related to the abundant (but minor) second
class from the low salt experiment. The winning
sequence is:

CUCAUGGAGCGCAAGACGAAUAGCUACAUA

Between the two experiments a total of 14
different sequences were obtained (sequences with one
difference are lumped together in this analysis); they
are listed here, with the similarities overmarked and
the frequencies noted. ngf.a through ngf.k are from
the low salt experiment, while hsngf.a through hsngf.c
are from the high salt experiment:

        xxxxxxxxxxx ####### Frequency
ngf.a   ACAUCGAUGACCGGAAUGCCGCACACAGAG 15/36
        XXXXXXXXXXX #######
ngf.b   CCUCAGAGCGCAAGAGUCGAACGAAUACAG 6/36
        $$$$$$$$$$$$$$ $$$$ $$$$






ngf.f 



xxxxxxxxxxx* # # # # # # 

XXXXXXXXXXX itili^*4H»A 



hsngf .c 



CCAUAGAGGCCACAAGCAASCTiJcCA 
CCUACAAGAAAAGAGGGAAGGAGAAAAAAA 



1/36 

1/36 

1/36 

1/36 

1/36 
1/36 

1/36 



hsngf.a CUCAU^E^^******* 



1/10 
1/10 



Jl?,;i.i"„° T"^. "^-^ i= 

eaaea wxthm the similar sequences i , 

th.t the wi^in, Pla« ""La^ ^ 

was pe.Cera:;r//irra:r "r- 

a Kd o£ about 20 to 3o tol^ k . '""^ *° 

candi^te Mixture The ! ' 

^ouna to have Tl^JTl^Z T"'^'^ "'^ 

protein ana tPA thZ a roK":::,::: " 

Ju« candidate mixture. Thus 




the SELEX derived nucleic acid ligand hsngf.a is a
selective ligand to NGF.

Example 11: Isolation of a nucleic acid ligand for
HSV-1 DNA polymerase.

Herpes simplex virus (HSV-1) is a DNA-containing
virus of mammals. HSV-1, like many DNA-containing
viruses, encodes its own DNA polymerase.
The HSV-1 DNA polymerase has been purified in two
forms, which have different qualities but each of which
will catalyze DNA replication in vitro. The simple
form, which is one polypeptide, is purified from cells
expressing the cloned gene according to Hernandez, T.R.
and Lehman, I.R., J. Biol. Chem., 265, 11227-11232
(1990). The second form of DNA polymerase, a
heterodimer, is purified from HSV-1 infected cells
according to Crute, J.J. and Lehman, I.R., J. Biol.
Chem., 264, 19266-19270 (1989); the heterodimer
contains one peptide corresponding to the polymerase
itself and another, UL42, also encoded by HSV-1.

SELEX was performed on both the single
polypeptide and the heterodimer. The binding buffer in
each case was 50 mM potassium acetate plus 50 mM Tris
acetate, pH 7.5, and 1 mM dithiothreitol. Filtration
to separate bound RNA was done after four minutes of
incubation at 37 degrees; the filters were washed with
binding buffer minus dithiothreitol.

The RNA candidate mixture was transcribed from
DNA as described previously. As is the case in other
embodiments, the DNA sequence includes a bacteriophage
T7 RNA polymerase promoter sequence that allows RNA to
be synthesized according to standard techniques. cDNA
synthesis during the amplification portion of SELEX is
primed by a DNA of the sequence:

cDNA primer (PCR primer 1): 5' GCCGGATCCGGGCCTCATGTCGAA 3'

The DNA primers used to amplify the cDNA in
that portion of the SELEX cycle include, in one of
them, the T7 promoter; that PCR primer has the
sequence:

PCR primer 2: 5' CCGAAGCTTAATACGACTCACTATAGGGAGCTCAGAATAAACGCTCAA 3'

The initial randomized DNA included the
sequence with the T7 promoter, 30 randomized positions,
and the fixed sequence complementary to PCR primer 1.
The RNA that is used to begin the first cycle of SELEX
thus has the sequence:

pppGGGAGCUCAGAAUAAACGCUCAA - 30N - UUCGACAUGAGGCCCGGAUCCGGC
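
The relationship between PCR primer 2 and the 5' fixed region of the transcript can be checked with a short sketch. It assumes the commonly used 17-nucleotide T7 promoter core TAATACGACTCACTATA, with transcription beginning at the G that follows it; the function name is illustrative, and the 3' fixed region is entered as read from the text above with scanning errors corrected, which is an assumption.

```python
# Assumed T7 promoter core; transcription starts at the next base.
T7_PROMOTER = "TAATACGACTCACTATA"

# PCR primer 2 as given in the text (5' to 3').
PCR_PRIMER_2 = "CCGAAGCTTAATACGACTCACTATAGGGAGCTCAGAATAAACGCTCAA"

# 3' fixed region of the RNA as read from the text (assumption).
THREE_PRIME_FIXED = "UUCGACAUGAGGCCCGGAUCCGGC"


def transcribed_fixed_region(primer: str) -> str:
    """Return the RNA encoded downstream of the T7 promoter in the primer."""
    start = primer.index(T7_PROMOTER) + len(T7_PROMOTER)
    return primer[start:].replace("T", "U")


five_prime_fixed = transcribed_fixed_region(PCR_PRIMER_2)

# Starting RNA with N marking each of the 30 randomized positions.
starting_rna = five_prime_fixed + "N" * 30 + THREE_PRIME_FIXED
print(five_prime_fixed)
print(len(starting_rna))
```

The derived 5' region ends in CGCUCAA, which agrees with the lower case flanking sequence shown for the cloned ligands in this example.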

SELEX was performed for seven rounds, after
which cDNA was prepared and cloned as described
previously. The series of sequences designated "H"
were obtained with the simple HSV DNA polymerase as the
target, while the "U" series was obtained with the
heterodimeric polymerase that includes the UL42
polypeptide.

About 25% of the sequences from the H series
contain an exact sequence of 12 nucleotides at the 5'
end of the randomized region (the upper case letters
are from the randomized region). In some sequences the
length between the fixed primers was not exactly 30
nucleotides, and in one case (H2) a large deletion was
found within the randomized region. The members of
this H subset include:

               xxxxxxxxxxxx
H5:   --cgcucaaUAAGGAGGCCACGGACAACAUGGUACAGCuucgaca--
H10:  --cgcucaaUAAGGAGGCCACAACAAAIGGAGACAAAuucgaca--
H4:   --cgcucaaUAAGGAGGCCACACACAUAGGDAGACAUGuucgaca--
H19:  --cgcucaaUAAGGAGGCCACAUACAAAAGGAUGAGUAAAuucgaca--
H20:  --cgcucaaUAAGGAGGCCACAAAUGCOGGOCCACCGAGAuucgaca--
H38:  --cgcucaaUAGGGAGGGCACGGGAAGGGUGAGDGGAUAuucgaca--
H2:   --cgcucaaUAAGGAGGCCACAAGuucgaca--




Two members of the U series share this primary
sequence motif:

U9:   --cgcucaaUAAGGAGGGCCACAGAUGUAAUGGAAACuucgaca--
U13:  --cgcucaaUAAGGAGGCCACAUACAAAAGGAUGAGUAAAAuucgaca--

The remaining sequences from the H and U series
show no obvious common sequence; in addition, no
sequences from the seventh round emerged as winning
single sequences in either series, suggesting that more
rounds of SELEX will be required to find the best
ligand family for inhibiting HSV DNA polymerase.

It appears that the primary sequence
--cgcucaaUAAGGAGGCCAC....
may be a candidate for an antagonist species, but those
members of the series have yet to be tested as
inhibitors of DNA synthesis. It appears that the fixed
sequence just 5' to the UAAGGAGGCCAC must participate
in the emergence of this subset, or the shared 12
nucleotides would have been positioned variably within
the randomized region.
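
The positional argument made above can be illustrated by locating the shared 12 nucleotides within each randomized region. The sketch below is illustrative only: the function name is hypothetical, the three clone sequences are taken from the lists in this example, and the shifted sequence is an invented control included to show what a variable position would look like.

```python
MOTIF = "UAAGGAGGCCAC"

# Randomized regions (upper case portion) of three clones listed above.
clones = {
    "H19": "UAAGGAGGCCACAUACAAAAGGAUGAGUAAA",
    "H2": "UAAGGAGGCCACAAG",
    "U13": "UAAGGAGGCCACAUACAAAAGGAUGAGUAAAA",
}


def motif_offset(randomized_region: str, motif: str = MOTIF) -> int:
    """Offset of the motif from the 5' end of the region, or -1 if absent."""
    return randomized_region.find(motif)


offsets = {name: motif_offset(seq) for name, seq in clones.items()}
print(offsets)

# Invented control: the same motif moved three positions downstream.
shifted_control = "ACG" + MOTIF + "AUACAAAAGGAUGAG"
print(motif_offset(shifted_control))
```

An offset of zero in every clone is what would be expected if the adjacent fixed sequence takes part in the selected structure.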



Example 12: Isolation of a nucleic acid ligand
for E. coli Ribosomal Protein S1.

The E. coli 30S ribosomal protein S1 is the
largest of the 21 30S proteins. The protein has been
purified based on its high affinity for
polypyrimidines, and is thought to bind rather tightly
to single stranded polynucleotides that are pyrimidine
rich. It was questioned if the RNA identified as a
ligand solution by SELEX was in any way more
information rich than a simple single stranded RNA rich
in pyrimidines.

The RNAs, DNAs, cDNA primer (PCR primer 1), and
PCR primer 2 were identical to those used for HSV-1 DNA
polymerase (see Example 11). The binding buffer
contained 100 mM ammonium chloride plus 10 mM magnesium
chloride plus 2 mM dithiothreitol plus 10 mM
Tris-chloride, pH 7.5. Binding was at room temperature, and
complexes were once again separated by nitrocellulose
filtration. The protein was purified according to I.
Boni et al., European J. Biochem., 121, 371 (1982).

After 13 SELEX rounds, a set of 25 sequences
was obtained. More than twenty of those sequences
contained pseudoknots, and those pseudoknots contain
elements in common.

The general structure of pseudoknots can be
diagrammed as:

STEM 1a - LOOP 1 - STEM 2a - STEM 1b - LOOP 2 -
STEM 2b (See Figure 31)

Most of the S1 protein ligands contain:

STEM 1 of 4 to 5 base pairs, with a G just 5'
to LOOP 1

LOOP 1 of about 3 nucleotides, often ACA

STEM 2 of 6 to 7 base pairs, stacked directly
upon STEM 1

LOOP 2 of 5 to 7 nucleotides, often ending with
GGAAC

A reasonable interpretation of these data is
that LOOP 2 is stretched across STEM 1 so as to hold
that loop rigidly in a form that simplifies and
enhances the binding of the single strand to the active
site of protein S1. A picture of the consensus
pseudoknot in two dimensions would look like this:

-R M G 

G 

30 /^y^ (C/G) A 

^ (U/A) A 
.C C 

N-N' 

35 // / / / A-u 

A-U 

. . G-C 
5 ' — NNNYR (G/C) (A/U) GACAC-gNNNNNNN — 3 ' 

In such figures the base pairs are shown as
lines and dashes, the selections of bases from the
randomized region are shown in upper case letters, Y is
a pyrimidine, R is a purine, N-N' means any base pair,
N means any nucleotide, and the lower case letters are
from the fixed sequence used for PCR amplifications.

It appears that single stranded polynucleotide
binding proteins and domains within proteins will often
select, during SELEX, a pseudoknot which presents the
extended, rigid single strand called LOOP 2 to the
binding site of the protein in a manner that maximizes
the interactions with that site. Thus, when the HIV-1
RT pseudoknot emerged, it is reasonable to think that
the single stranded domain LOOP 2 is bound within the
region of RT that holds the template strand during
replication. That is, it appears reasonable that most
replication enzymes (DNA polymerase, RNA polymerase,
RNA replicases, reverse transcriptases) will have a
domain for holding the template strand that might
prefer a pseudoknot as the ligand of choice from SELEX.

Example 13: Isolation of a nucleic acid ligand to
HIV-1 rev protein

The HIV-1 rev protein's RNA-recognition site
appears to be complex, and its function is essential to
the productive infection of an epidemic viral disease.
See Olsen et al., Science, vol. 247, pp. 845-848
(1990). The SELEX process on this protein was
performed in order to learn more about the recognition
element and to isolate a ligand to the target protein.

A candidate mixture was created with a 32
nucleotide long random region as described above in
Example 2. It was found that the rev protein could
saturably bind the starting candidate mixture with
half-maximal binding occurring at about 1 x 10(-7) M as
determined by nitrocellulose assays. All RNA-protein
binding reactions were performed in a binding buffer of
200 mM KOAc, 50 mM Tris-HCl pH 7.7, 10 mM
dithiothreitol. RNA and protein dilutions were mixed
and stored on ice for 30 minutes then transferred to 37
degrees for 5 minutes. (In binding assays the reaction
volume is 60 ul of which 50 ul is assayed; in SELEX
rounds the reaction volume is 100 ul.) Each reaction
is suctioned through a prewet (with binding buffer)
nitrocellulose filter and rinsed with 3 mls of binding
buffer after which it is dried and counted for assays
or subjected to elution as part of the SELEX protocol.
Ten rounds of SELEX were performed, using an RNA
concentration of about 3 x 10(-5) M. The concentration
of rev protein was 1 x 10(-7) in the first round, and
2.5 x 10(-8) in all subsequent rounds. The initial
candidate mixture was run over a nitrocellulose filter
to reduce the number of sequences that have a high
affinity for nitrocellulose. This process was also
repeated after rounds 3, 6, and 9. The cDNA product
was purified after every third round of selection to
avoid anomalously sized species which will typically
arise with repeated rounds of SELEX. After 10 rounds
the sequence in the variable region of the RNA
population was nonrandom as determined by dideoxy-chain
termination sequencing. 53 isolates were cloned and
sequenced.

Each of the cloned sequences is listed in
Table 10. All sequences were analysed by the Zuker
RNA secondary structure prediction program, see
Zuker, Science, vol. 244, pp. 48-52 (1989); Jaeger et
al., Proc. Natl. Acad. Sci. USA, vol. 86, pp. 7706-7710
(1989). On the basis of common secondary structure all
sequences have been grouped into three common motifs as
shown in Table 11. Motifs I and II are similar in
conformation including a bulged loop closed at each end
by a helix. This generalized structure has been
illustrated schematically in Table 12, and the domains
labeled for easy discussion; that is, from 5' to 3': Stem
1a (which base pairs to the 3' Stem 1b), Loop 1, Stem
2a, Loop 3, Stem 2b, Loop 2, and Stem 1b. The
sequences which fit in the various domains are listed
for individual sequences in Table 12. (Note that in
sequence 3a, the homologous alignment is flipped 180
degrees so that it is Stem 1 which is closed with a
loop.) The energies of folding of the RNA molecule
(including the fixed flanking sequences) are shown in
Table 13.

The wild-type rev responsive element (RRE) that
has been determined to be at least minimally involved
in binding of rev to HIV-1 transcripts was also folded
by this program, and is included in Tables 12 and 13.

The sequences were also searched for related
subsequences by a procedure based on that described in
Hertz et al., Comput. Appl. Biosci., vol. 6, pp. 81-92
(1990). Two significant patterns were identified.
Each isolate was scored to identify its best match to
the patterns, the results of which can be seen in Table
13. The related subsequence motifs are presented by
the common secondary structures in similar
conformations; that is, the first sequence UUGAGAUACA
is commonly found as Loop 1 plus the 3' terminal CA,
which pairs with the UG at the 5' end of the second
information rich sequence UGGACUC (commonly Loop 3).
There is also a strong prediction of base-pairing of
the GAG of sequence I to the CUC of sequence II. Motif
II is similar to Motif I in that the subsequence
GAUACAG predominates as a loop opposite CUGGACAC with a
similar pairing of CA to UG. Motif II differs in the
size of the loops and some of the sequence, particularly
in the absence of predicted base-pairing across the
loop. One domain of the wild-type RRE closely
resembles Motif II. Motif III is the least like all
the other sequences, although it is characterized by
two bulged U's adjacent to base-paired GA-UC as in
Motif I. Unfortunately, further comparisons are
complicated because the folding pattern of Motif III
involves the 3' fixed sequence region in critical
secondary structures; because these sequences are
invariant there is no way to analyse the importance of
any one of them. The folded sequences of
representatives of each Motif are shown in Figure 23
with the folded sequence of the wild-type RRE.

The sequences were further analyzed for their
affinity to the rev protein. Templates were PCR'd from
a number of clones from which labeled in vitro
transcripts were prepared and individually assayed for
their ability to bind rev protein. These binding
curves are shown in Figures 24 to 28. Labeled
transcripts from oligonucleotide templates were also
synthesized which contain the wild-type RRE discussed
above, and what is inferred to be the consensus motif
in a highly stable conformation. To control for
experimental variations, the best binding sequence,
isolate 6a, was assayed as a standard in every binding
experiment. The RNA-protein mixtures were treated as
described above except that diluted RNAs were heated
to 90 degrees for 1 minute and cooled on ice prior to
mixing. The average Kd for isolate 6a was 8.5 x 10(-8)
M, and the results of this experiment are shown in
Table 13.
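
To give a sense of what such a binding constant implies, the fraction of a labeled RNA bound at a given protein concentration can be estimated with a simple one-site model. This is a sketch under stated assumptions, not the analysis used for Table 13: it assumes 1:1 binding with protein in excess over labeled RNA, treats the reported value for isolate 6a as a dissociation constant, and models a 24-fold lower affinity as a 24-fold larger constant.

```python
def fraction_bound(protein_conc_m: float, kd_m: float) -> float:
    """Fraction of RNA bound for 1:1 binding with protein in excess."""
    return protein_conc_m / (protein_conc_m + kd_m)


KD_6A = 8.5e-8          # molar, average value reported for isolate 6a
KD_WEAKER = 24 * KD_6A  # assumed constant for a 24-fold weaker ligand

# At a protein concentration equal to the constant for 6a, half of the
# 6a RNA is bound, while the weaker ligand is bound to 1 part in 25.
half = fraction_bound(KD_6A, KD_6A)
weak = fraction_bound(KD_6A, KD_WEAKER)
print(half, weak)
```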

The binding curves of Figure 24 show that the
evolved population (P) improved approximately 30-fold
for binding to rev protein relative to the starting
candidate mixture. The binding of the wild-type RRE
closely resembles that of the most abundant clone, 1c.
This experiment also illustrates how sensitive the rev
binding interaction is to secondary structure.
Isolates 6a and 6b are identical in the regions of high
information content, but are quite different at the
level of secondary structure resulting in changes at
three nucleotide positions. These changes, which
predict the base-pairing of Stem 1, lower the affinity
of 6b by 24-fold. Sensitivity to secondary structure
anomalies is further illustrated by the binding of
isolate 17 as shown in Figure 25. Isolate 17 has the
maximum information score as shown in Table 12.
However, there is an extra bulged U at the 5' end of
Loop 1 as shown in Table 11. This extra U results in
isolate 17's reduced affinity for rev as compared to
other sequences of Motif I. In contrast, single
nucleotide deletions of Loop 2 sequences, even those
that diminish the prospect of cross-bulge base-pairing,
are well tolerated by the rev interaction.

Another compelling commonality is the
conservation of the sequence ACA opposite UGG where the
CA pairs with the UG to begin Stem 2. This sequence is
shared by Motifs I and II as well as by the wild-type
RRE. Sequences 11 and 12 exhibit a base-pair
substitution at this position (see Table 12), and
sequence 12 was tested and has reduced affinity
compared to most of the other Motif I sequences.

The RNA sequences determined by SELEX to be rev
ligands may be classified by primary and secondary
structure. A consensus emerges of an asymmetric bulge
flanked by two helices in which are configured
specifically conserved single and double stranded
nucleotides. Although base-pairing across the bulge is
predicted for many of the sequences isolated (Motif I),
it may not be essential or crucial to rev interaction.
Optimal sizes for Loop 1 appear to be 8 (Motif I) or 6
(Motif III) where there is an observed penalty for
sizes of 9 or 3. Optimal sizes for Loop 3 are 5 and 4.
In addition, the interaction of rev with the various
domains of these ligands may be additive. Motif II
resembles Motif I primarily at the junction of Loops 1
and 3 at Stem 2. Motif III resembles Motif I at the
junction of Loops 1 and 3 at Stem 1. Consensus
diagrams of the Motif I and II nucleic acid solutions
for HIV-rev are shown in Figures 29 and 30.

The abundance of sequences in the cloned
population is not strictly correlated with affinity to
rev protein. It is possible that the concentration of
rev protein used throughout the SELEX process was
sufficient to bind a significant percentage of all
these isolates. As a consequence, there may have been
selection for replicability of cDNA and DNA during PCR
superimposed on a low stringency selection for binding
to rev. The highly structured nature of these ligands
and the possible differences in the efficiency of cDNA
synthesis on these templates reinforce this potential
replicative bias. Also, there is some mutation that
occurs during the SELEX process. The sequence 6a so
resembles 6b that they must have a common ancestor.
This relatively late arrival during the rounds of SELEX
may explain the paucity of this sequence irrespective
of its higher affinity to the target. In the same
manner, some of the ligands that have emerged may have
mutated relatively recently during selection from
ancestor sequences that exist in the initial candidate
mixture but are not represented in the cloned
population.

The invention disclosed herein is not limited 
in scope to the embodiments disclosed herein. As 
disclosed, the invention can be applied by those of 
ordinary skill in the art to a large number of nucleic 
acid ligands and targets. Appropriate modifications, 
adaptations and expedients for applying the teachings 
herein in individual cases can be employed and 
understood by those skilled in the art, within the
scope of the invention as disclosed and claimed herein. 




TABLE 2

1a) 5'-taatacgactcactatagggagccaacaccacaattccaatcaag-3'
(bridging oligo for 5' construction and 5' PCR oligo)

1b) 5'-taatacgactcactatagggagcatcagacttttaatctgacaatcaag-3'
(bridging oligo for 5' construction and 5' PCR oligo)

2) 5'-atctatgaaagaattttatatctc-3'
(bridging oligo for 3' ligation)

3a) 5'-gaattgtggtgttggctccctatagtgagtcgtatta-3'
(template construction oligo)

3b) 5'-tcagattaaaagtctgatgctccctatagtgagtcgtatta-3'
(template construction oligo)

4) 5'-tttcatagatnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnncttgattg-3'
(template construction oligo)

5) 5'-ccggatccgtttcaatagagatataaaattc-3'
(3' cloning oligo and template construction oligo)

6) 5'-gtttcaatagagatataaaattctttcatag-3'
(3' primer for PCR)

7) 5'-ccaaagcttctaatacgactcactatagggag-3'
(5' PCR primer for cloning and for inhibition assay)

8) 5'-agagatataaaattctttcatagnnnnttttcccgnnnnnnnncggaanncttgattgtcagattaaaagtc-3'
(random template for SELEX experiment 3)

9) 5'-gacgttgtaaaacgacggcc-3'
(3' PCR and RT extension primer for inhibition assay)
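
The relationship between the template construction oligos and the bridging oligos of Table 2 can be checked mechanically. The following is a minimal sketch, not part of the disclosed method: it assumes that the first 17 nucleotides of oligos 1a and 1b are the promoter (taatacgactcactata is the familiar T7 promoter sequence) and that transcription starts at the "ggg" that follows. The function names are hypothetical.

```python
# Oligo sequences copied from Table 2, with scanning gaps removed.
OLIGO_1A = "taatacgactcactatagggagccaacaccacaattccaatcaag"
OLIGO_1B = "taatacgactcactatagggagcatcagacttttaatctgacaatcaag"
OLIGO_3A = "gaattgtggtgttggctccctatagtgagtcgtatta"
OLIGO_3B = "tcagattaaaagtctgatgctccctatagtgagtcgtatta"

# Assumption: the promoter occupies the first 17 nucleotides and the
# transcript begins with the "ggg" immediately after it.
PROMOTER_LENGTH = 17

COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a", "n": "n"}


def reverse_complement(dna):
    """Return the reverse complement of a lower-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcript(top_strand):
    """Return the RNA copied from the region downstream of the promoter."""
    return top_strand[PROMOTER_LENGTH:].replace("t", "u")


# Oligos 3a and 3b pair with the 5' portions of 1a and 1b respectively.
pairs_3a = reverse_complement(OLIGO_3A) == OLIGO_1A[:len(OLIGO_3A)]
pairs_3b = reverse_complement(OLIGO_3B) == OLIGO_1B[:len(OLIGO_3B)]

# The transcripts reproduce the 5' fixed regions of the starting RNAs.
fixed_5prime_1a = transcript(OLIGO_1A)
fixed_5prime_1b = transcript(OLIGO_1B)
```

Under these assumptions the transcript of oligo 1b matches the 5' fixed region given for the starting RNA of Table 3, and the transcript of oligo 1a matches that of Table 6.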



TABLE 3

Starting RNA

5'-gggagcaucagacuuuuaaucugacaaucaag-[-32 n's-]-

-aucuaugaaagaauuuuauaucucuauugaaac-3'

isolate 



ucaagAAUUCCGUUUUCAGUCGGGAAAAACUGAACAaucu (13, 

ucaagCGUAGGUUAUGAAOGGAGGAGGUAGGGUCGUAaucu (5, 

-f!f!?^??!!S!!!?555^?ffGAACGGGAAAACCGGCaucu (1^ 

iaucu (4) 



1.1 

1.2 

1.3a 
1.3b 
1.3c 
1.3d 
1.3e 

1 . 4 "CaagGGCAUCUGGGAGGGUAAGGGUAAGGUUGUCGG, 
1.6 

1 • 7 ucaa 

-*^^>-~»^^a«^u<jAUAAUGCGGCaucu (2) 

1-8 "caagGAUUAACCGACGCCAACGGGAGAAOGGCAGGGaucu (2) 
1.9a 
1.9b 



-.-wv^v.^ouuuw>ut-Lil,aUCU (4) 

ucaagCCCACGGAOGUCGAAGGCJGGAGGUUGGGCGGCaucu (3, 

ucaagAAGAAGAUUACCCAAGCGCAGGGGAGAAGCGCaucu (2) 
uca^gGAAUCGACCCAAGCCAAAGGGGAUAAOGCGGG 

gGAUUAACCGACGCCAACGGGAGAAOGGCAGGG.ucu (2) 

^fiJfCCGGCGGGAUAUCGGCGaucu ( 1 ) 

-!!!!!;!?????^?f?f^CAUGCACAGCUACACUCaucu a) 

C 

1 . 11 "caagCOCACGGAUGUCGAAGGUGGAGGUUGGGCGGCAuc (i, 

1 . 12 ucaagCAUAGACCGCGUAGGGGGAGGUAGGAGCGGCCaucu 
1 - 13 "CaagCUCUOUCAUAGACCGCGGAGGAGGUOGGGAGaucu 

1.14 "CaagUUCCUAGUAGACUGAGGGUGGGAGUGGUGGAUGucu (l, 

1.15 "caagCCAAUUACUUAUUUCGCCGACUAACCCCAAGAaucu (I, 

1.16 ^caagGAGGCCAAUUCCAUGUAACAAGGUGCAACUAAUaucu (1, 

1.17 ucaagUGCGUAUGAAGAGUAUUUAGOGCAGGCCACGGaucu (1) 

1.18 "CaagUAAUGACCAGAGGCCCAACUGGUAAACGGGCGGucu (1, 

1.19 "c:aagAGACUCCACCUGACGUGUUCAACUAUCUGGCGaucu (l, 
Nucleotides of the fixed regions are shown as lower case letters. 



cu (1, 



TABLE 4

Pseudoknot Motif



1 . 1 ucaagAAUUCCGUUUUCAGUCGGGAAAAACUGAACAaucu (13) 



1.3a ucaagAAUAUCUUCCGAAGCCGAACGGGAAAACCGGCauCU (1) 



2.9 ucaagGUUUCCGAAAGAAAUCGGGAAAACUGucu (1) 



2.4a ucaagUAGAUAUCCGAAGCUCAACGGGAUAAUGAGCaucu (3) 



2.7a ucaagAUAUGAUCCGUAAGAGGACGGGAOAAACCUCAa-CU (3) 
1 . 7 ucaagGAAUCGACCCAAGCCAAAGGGGAUAAUGCGGCaucu ( 2 ) 



2.11 ucaagUCAUAUUACCGUUACUCCUCGGGAUAAAGGAGaucu (1) 



1.18 ucaagUAAUGACCAGAGGCCCAACUGGUAAACGGGCGGucu ( 1 ) 



1.8 ucaagGAUUAACCGACGCCAA-CGGGAGAAUGGCAGGGaucu (2) 



2.1b ucaagAAUAUAUCCGAACUCGA-CGGGAUAACGAGAAGaGCU ( 7 ) 
1 . 6 ucaagAAGAAGAUUACCCAAGCGCA-GGGGAGAAGCGCaucu (2) 



2.10 ucaagUAAAUGAGUCCGUAGGAGG-CGGGAOAUCUCCAAcu (1) 

1.9b ucaagAGAGUAUCA^bGUGCCGG — CGGGAUAUCGGCGaucu {!) 

2.12 ucaagAAOAAUC^GJVCUCG CSGGAUAACGAGAAGAGcu (l) 

2, . 1 Ob ucaagUUCGAACAAG — CGGAACAUGCACAGCCACACUCaUCU ( 1 ) 



2.3a caagOUAAACAUAAUCCGOGAUCUUOCACACGGGAGaucuaugaaaga (7) 



TABLE 4 CON'T

2.2b "«C«gL'Ai^AGGU^AUAAA-^GAGAAfeGUGUGa-cu (1, 
2.5a -caagACJAGUAE^UUCUUGAUducSiiACAAAUGSiriu (3, 
2.6b ucaagUGAAACUUAAewOUAUCAUAGAUci^^ (2) 

Nitrocellulose retention motif

1.2 ucaaguwuAGG TOAtfG^GGAGGAGGUAGGG ijcdUAaucuaug (5, 

1.4 aucugacaau^aagGGCAUCPGGgAGGGOAAGGGUA AZlGUUGOCGgauc u (4, 

1.5 "CaagCCCACGGAOGUCGAAGGUGGAGGUOGGGCGGCaucu (3) 

1.11 UcaagCUCACGGAUGUCGAAGGOGGAGGUOGGGCGGCASS (i) 

1.12 ucaagcAUAGkcCGCGOAGGGGGAGGUAGGAGCGGCCatfcuaUg (i) 

1.13 ucaagCucuu ucAUAy GCGGAGGAGGOPGGGAG aticuaugaaaga (i, 

1.14 ucaagUUCCUAGUAGACUGAGGGOGGGAGUGGaGGAofeaS' (1) 

Secondary structures as predicted [illegible]
overlined arrows which highlight inverted repeats indicative of
base-pairing.



TABLE 5

[Table 5 is not legible in the scan. Recoverable column headings: Clone; Freq.; Stem 1(a); Loop 1; Stem 2(a); Loop 2; Stem 1(b); Loop 3; Stem 2(b).]



TABLE 6

Starting RNA

5'-gggagccaacaccacaauuccaaucaag-[-32 n's-]-

-aucuaugaaagaauuuuauaucucuauugaaac-3'

isolate 

2.1a ucaag AAOAOA UCCGAACUCGACGGGAOAACGAGAA Gaucu (3) 

2.1b G__ 

2.1c CA-- G- (1) 

2. Id C ■ G— (1) 

2.ie G G~ (1) 

^— 

2.ig _ 

2.1h A ; G — (1> 

2.li GU Ill 

lii IZIZ : A— G— (1) 

^.IK C G— (1) 

2.2a ucaagOACCOAGGUGAUAAAAGGGAGAACACGUGA acu (1) 
2.2b ^OG (13) 

i'i^ ^ — ^ — <2) 

2.2cl G (1) 

2.3a ucaagUUAAACAUAAUCCGUGAUCOTJUCACACGGGAGaucu (7) 

2.3b C~ (1) 

2.3c — jj^j 

2.4a ucaagUA GAOADCCGAAGCUCAACGGGAUAAUGAGCaucu (3) 

2.4b C-AAU (1) 

2.4c. G (1) 

2.4d A _ 

2.4e U— AU (ij 

2.5a ucaagAUAGaAOCCGUUCOOGAUCADCGGGACAAADGaucu (3) 

2 . 5b C 1 1 \ 

2.5c. U III 

2.5d (1, 

2.6a ucaagUGAA CUUAACCGUUAUCAUAGAUCGGGACAAa cu (1) 

2.6b A u__ (2) 

2.6c . u — ^j^j 

2.6d ^A U u — (1) 

2.7a ucaagAUAUG AOCCGOAAGAGGACGGGAUAAACCUCAacu (3) 
2.7b U G (1) 

2.8 ucaagGGGUAUUGAGAUAUUCCGAUGUCCUAUGCUGUaCcu (2) 

2.9 ucaagGUUUCCGAAAGAAAUCGGGAAAACUGucu (1) 

2.10 ucaagUAAAUGAGUCCGUAGGAGGCGGGAUAUCUCCAACU (1) 

2.11 ucaagUCAUAOUACCGUUACOCCUCGGGAUAAAGGAGaucu ( 1 ) 



TABLE 6 CON'T

2.12 ucaagAAUAAOCCGACOCGCGGGAUAACGAGAAGAGcu (1) 

2.13 ucaagGAOAAGUGCAGGAAOAUCAAOGAGGCAUCCAAaCcu (1) 

2.14 ucaagAUGAGAUAAAGUACCAAUCGAACCUAUCUAAUACGAcu ( 1 ) 

2.15 ucaagACCCAOOOAUOGCUACAAUAAUCCUOGACCUCaucu ( 1 ) 

2.16 uc aagUAAUACGAUAUACUAAUGAAGCCUAAUCUCGaucu ( 1 ) 

2.17 ucaagAACGAOCAUCGAUAUCUCUUCCGAUCCGUUUGucu ( 1 ) 

2.18 ucaagACGAUAGAACAAOCAOCUCCUACGACGAUGCAcu (1) 

2.19 ucaagAUAAUCAUGCAGGAOCAUUGAUCUCUUGUGCOaucu (1) 

2.20 uc aagAGUGAAGAUGUAAGUGCUOAOCUCUOGGGACACaucu ( 1 ) 

2.21 ucaagCAACAOUCUAOCAAGUAAAGOCACAUGAUaucu (1) 

2.22 ucaagGAUGOAUUACGAUOACOCUAUACUGCCUGCaucu ( 1 ) 

2.23 ucaagGGAUGAAAAUAGUOCCUAGOCOCAUOACGACCAcu (1) 

2.24 ucaagOAGUGOGAOAAOGAAOGGGOUUAOCGUAUGUGGCcu (1) 
1 . 1 ucaagAAUUCCGUUUOCAGUCGGGAAAAACUGAACAaucu (17) 



TABLE 7

Starting RNA

5'-gggagcaucagacuuuuaaucugacaaucaagNNttccgNNNNNNNNcgggaaaaNNNN

-cuaugaaagaauuuuauaucucuauugaaac

isolate 

3-2 tcaagTAttccgAAGCTCAAcgggaaaaTGAGcta 

3-3 tcaagTAttccgAAGCTTGAcgggaaaaTAAGcta 

3-6 tcaagGAttccgAAGTTCAAcgggaaaaTGAActa 

3-7 tcaagAGttccgAAGGTTAAcgggaaaaTGACcta 

3-25 tcaagGAttccgAAGTGTAAcgggaaaaTGCActa 

3-50 tcaagTAttccgAGGTGCCAcgggaaaaGGCActa 

3-22 tcaagTAttccgAAGGGTAAcgggaaaafGCCcta 

3-8 tcaagTAttccgAAGTACAAcgggaaaaCGTActa 

3-13 tcaagGAttccgAAGTGTAAcgggaaaaCGCActa 

3-23 tcaagGAttccgAAGCATAAcgggaaaaCATGcta 

3-43 tcaggGAttccgAAGTGTAAcgggaaaaAGCActa 

3-45 tcaagTAttccgAGGTGTGAcgggaaaaGACActa 

3-21 tcaagTAttccgAAGGGTAAcgggaaaaTGACcta 

3-9 tcaagTGttccgAGAGGCAAcgggaaaaGAGCcta 

3-37 tcaagTAttccgAAGGTGAAcgggaaaaTACAZta 

3-56 tcaagAGttccgAAAGTCGAcgggaaaaTAGActa 

3-58 tcaagATttccgAGAGACAAcgggaaaaGAGTcta 

3-39 tcaagATttccgATGTGCAAcgggaaaaTGCActa 

3-33 tcaagTAttccgACGTAACAcgggaaaaGTTActa 

3-4 6 tcaagATttccgACGCACAAcgggaaaaTGTGcta 

3-52 tcaagTAttccgATGTCTAAcgggaaaaTAGGcta 



TABLE 7 CON'T



3-16 tcaagGGttccgATGCCCAAcgggaaaaGGGGcta 



3-34 tcaagAAttccgACGACGAAcgggaaaaACGTcta 



3-35 l:caagTAttccgATGTACAAcgggaaaaAGTAcl:a 



3-60 tccagCGttccgTAAGTGGAcgggaaaaACCActa 



3-27 tcaagAGttccgTAAGGCCAcgggaaaaAGGTcta 



3-15 ticaagGAttccgAAAGGTAAcgggaaaaATGCcta 



3-18 tcaagAAttccgCTAGCCCAcgggaaaaGGGCcta (2) 



3-31 tcaagAAtt-cgTTAGTGTAcgggaaaaAACActa 



3-26 tcaagCGttccgATGGCTAAcgggaaaaATAGcta 



3-32 HcaagGAttccgTTTGTGCAcgggaaaaGGCActa 

* * 



3-54 tcaagAA-tccgTTTGCACAcgggaaaaCGTGcta 

3-41 tcaggAA-tccgAGAAGCTAcgggaaaaAGCGActa 

3-29 t caagATtt ccgAGGTCCGAcgggaaaaTGGTct a 

3-20 rcaagTAttccgAAGGAAAAcgggaaaaCCACcta 



3-36 tcaagTGttccgAAGGAAAAcgggaaaaCCACcta 



3-28 tcaagAAttccgTAAGGGGTcgggaaaaACCctau 



3-48 tcaagGAttccgTATGTCCTcgggaaaaAGGActa 



3-59 .tcaagAGttccgAAAGGTAAcgggaaaaTTACcta 



3-12 tcaagTAttccgATAGTCAAcgggaaaaGCGActa 



3-30 tcaagTAttccgAGGTGTTAcgggaaaaCACGcta 



3-11 tcaagAAttccgTATGTGATcgggaaaaACCActa 




TABLE 7 CON'T 

3-17 tcaagGAttccgATGTACAAcgggaaaaCTGTcta 
3-24 tcaagATttccgAAGGATAAcgggaaaaACCGActa 
3-51 tcaagAAttccgAAGCGTAAcgggaaaaCATAEta 



TABLE 8
XwplM Conacructlon: 

CM M^CA ACACC ACAM, UCCAX OCiU^ -1^^' AUCOA OCAAX CAAUU UOAUX UCUCO AUUGA .AC 



CiOA« 



a2n luiuiom Bagion 



CXonea wiUi AUCA Xoops 



I 
2 
3 
4 
S 
6 
7 
d 
9 

10 
IX 
X2 

X3 
14 
l& 
16 
X7 

xa 

19 

20 
21 

22 
23 
24 
25 
26 
27 
20 . 



CAC 
AU 
CGAAU 
UGGAG 
UCA 
CUCA 
GCAUU 
G 
A 
C 

CGA 
UCCA 



AGAUA 
AUAAG 
AACUG 
UAUAA 
CAGAU 
GAtlAU 
AAUAU 
GGAGA 
AAUUA 
GGAGA 
AUACU 



UCACU 
UAAU6 

cuutic 

ACCUU 
AGCUC 
AUGAC 
GUCUG 
UUCUU 
UCUUC 
UUCUU 
UUCUU 
GUUAG 



UCUGU 
CAUGC 
GUCGA 
UAUGG 
AUAGG 
AGAGU 
CAUGA 
AGUAC 
GGAAU 
ACUAC 
UCGAU 
UAGUU 



UCACC 
GCACC 
UCACC 
UCACC 
ACACC 
CCACC 
UCACC 
UCACC 
GCACC 
UCACC 
GCACC 
GCACC 



AUCA 
AUCA 
AUCA 
AUCA 
AUCA 
AUCA 
AUCA 
AUCA 
AUCA 
AUCA 
AUCA 
AUCA 



GGGGA 

GCGC6 U 

GGG 

GGG 

GGG 

GGG 

GGG 

GGGGG CA 
GGGCA UGG 
GGGGG CA 

GGGC 



CUAU AGAUA CUUCU 
GGAU AUGAU CUUAU 

uUG UCUUU CAUGU 
AGAGC UAGUU CUUGU 

ACG AGAUU UAUUU 



ACUGA UCACO AUpA CGGG 
GGUAU GCACQ AUCA CGGC 
AGUAA GCACG AUCA CGGC6 
UUAA6 ACACO AUCA CG6 
AGAIKS UCACQ AUCA CGGGC 



UAAU UGAUA CUUOC ACAG6 AUCA CCCUC CUM 
M AGCAC UCAOU AGAOQ AUCA CCCUA GUOCG Q 



CUAUG 
AU CUAUG 
AU CUAUG 
AU CUAUG 
U CUAUG 
AU CUAUG 
AU CUAUG 
CUAUG 
CUAUG 
CUAUG 
U CUAUG 
AU CUAUG 



GAGAU AUCAU AAUUC AUUGU UGAGC AUCA CM 
GAMU AMw«. ^^^^ AGAGC AUCA CCCUA 



UACAU UGCGU GGC 



GAGA UCAAU AGUAA GCACC AUCA MCCU GG 
UGAC AUAUC UCUAU AGUGU GCACC AUCA C« 

A UCAGA UAGAU CAUGC UCACC AUCA ^GC 
ACAC UAUUC UACAU GAUUU CCAUC AUCU GGGCC 
0;^UU AAUUC GUCUU JKJ^ 

CA CUAAC AUAGC AUCA OCAUC UUCUU CCCOC C 



AAAGA 
AAAGA 
AAAGA 



AAAGA 
AAAGA 
AAAGA 
AAAGA 
AAAGA 
AAAGA 
AAAGA 
AAAGA 



U CUAUG AAAGA 
AU CUAUG AAAGA 
A CUAUCAAAAGA 

U CUAUG AAAGA 
AC CUAUG AAAGA 

AU CUAUG AAAGA 
U CUAUG AAAGA 

AU CUAUG AAAGA 
A CUAUG AAAGA 

CUAUG AAAGA 
AU CUAUG AAAGA 

CUAUG AAAGA 
UAUG AAAGA 
A CUAUG AAAGA 
AU CUAUG AAAGA 
AU CUAUG AAAGA 



-13.0 
-19.0 
-17.5 
-13.3 
-13. o 
-XO.O 
-12.6 
-12.6 
-10.9 
-10. J 
-17.6 
-11.3 

- 9.7 
-X7.5 
-10.5 
-12.6 

- 7.6 

-10. a 

-15.0 

-12.6 
-12. 9 

-14.6 
-15.3 
-11.3 
- 9.3 
-13.3 
-XX. 4 
-X4.6 



29 
30 
31 
32 

33 
34 
35 

36 



CXOMS Mich ANCA loop* 

CCCUU AAUUU GGAUU AUAGA UCA^ AACA COG 
GAGA UGUUU AGUAC UUCAG CCACC AACA ^ 
OUCA UACUC «CUUU GUn^ CO^ 

CAACA GAGAU GAUAU CAGGA UGAG6 ACCA CCC 
AGAUA UAAUU CUCCU CUUGA UGAGC ACCA CCC 

icAUA UGAGA UAGUU GCACC ACCA GGGUG 



AUA UAGGA GAUAU UGUAO UCACG AOCA CGGG 



AC CUAUG AAAGA - 7.9 

U CUAUG AAAGA -14.2 

AU CUAUG AAAGA - 9.4 

A CUAUG AAACA. - 9.5 

AU CUAUG A GGA -XX. 8 

AU CUAUG AAAGA -Xd.5 

AU CUAUG AAAiGA -X6.d 

CUAUG AAAGA -X2.5 



37 
38 



CXOMS Mith no ANCA Xoop 

UGCGUCACUUAUUGGAACUCUOGGUOGC 
CUG6AGGAGAUUCUGUAAUCCCUUGAACUCC 



A CUAUG AAAGA 
A CUAUG AAAGA 



-17.7 
- 9.7 



TABLE 9

[Table 9 and its continuation are not legible in the scan.]



TABLE 10 

sequence 
number 

la 
lb 
Ic 
Id 
le 



3a 
3b 

4a 
4b 
4c 






no. of 
isolates 



6a 
6b 

7a 
7b 

8 

9a 
9b 

10 

11 

12 

13 

14 

15 

16 

17 

18 

19 

20 





tcaagCTTGAGATACAGATTTCTCSATTCTGGCTCGCTatct (S) 

!f!!f^!?!?f!2f?!ATCAAACGACCTTGAGACACatCt (4) 

(3) 
(1) 
(1) 

tcaagAAGCCTTGAGATACACTATATAGTGGACCGGCatct (3) 

~!!j?!!f?!!2^^CACGTTTGTGGACTCTGT-atct (2) 

Q^^g 

-f!!f^!Sfi?f?!^?5^^'2ACAATACTGGACAC6C-atct (2) 



tcaagGGGACTCTTTTCAATGATCCTTTAACCAGTCGatct (2) 

!~!???!?f?f?J!S!?*'^^''«^CCTTAACCGGITGatct (1) 

tcaagCACGCATGACACAGATAAACTGGACTACGTGCatct (1) 

tcaagACACCTTGAGGTACTCTTAACAGGCTCGGTGatCt (i) 

toaagTTGAGATACCTGAACTTGGGACTCCTTGGTTGatct (1) 

tcaagGGATCTTGAGATACACACGAATGAGTGGACTCGatct (1, 

tcaagATCGAATTGAGAAACACTAACTGGCCTCTTTGatct (1) 

tcaagGCAGCAGATACAGGATATACTGGACACTGCCGatct (1) 

tcaagGGATATAACGAGTGATCCAGGTAACTCTGTTGatct (1) 

tcaagGTGGATTTGAGATACACGGAAGTGGACTCTCCatct (1) 

tcaagAGATAATACAATGATCCTGCTCACTACAGTTGatCt (1) 

tcaagGGAGGTATACAGAATGATCCGGTTGCTCGTTGatCt (1) 

tcaagAGAAGAATAGTTGAAACAGATCAAACCTGGACatct (1) 



TABLE 11



MOTIF I 



aagGGAUCUUGAGAUACACACGA AUGAGUGGACUCGaucuaugaaa 13 (1) 



agGUGGAUUUGAGAOACACGG ^AAGUGGACUCUCCaucuauga 17 (1) 



agGGUGCAUUGAGAAACACGU--- —UUGUGGACUCUGUaucuauga 6a (2) 



-CGACCUUGAGACACaucu-3' 5' -agAUGGACOCGGOACJCAAA- 3a (4) 



agAUCGAAOUGAGAAACACUA ACUGGCCUCUOUGaucuaug 14 (1) 



caaucaagOgGAGAUACCUGAA— CUUGGGACUCCUUGGUUGAUc 12 (1) 



aagAUGGCOGGAGAOACAAAAC — - — DAUUUGG^CUCGCCaucuauga 4a (3) 



aagAAGCCaUGAGAUACACUAU-- ^AOAGUGGAC^CGGCaucuauga 5 { 3 ) 



aaucaagCUUGAGAUACAGAUU-UCUGAUUCUGG-CUCGCUaucuauga 2 ( 5 ) 



aagACACCUUGAGGUACUCUU AACAGG^CUCGGUGaucuaug 11 (1) 

MOTIF II 



ucaagGAGAUGAAGAUACAGCUCUA-^^GAUGCUGGACACaucuauga Ic (9) 



aaucaagAGCGAAGAUACAGAAGACAA—UACDGGACACGCaucuau 7a (2) 



aaucaagGCAGCAGAUACAGGAU AUACUGGACACDGCCGAUc 15 (1) 



gAGAAGAAaAGUUGAAACAGAUC~«-*-AAACCUGGACaucuaugaaa 20 (1) 



aucaagCACGCAUGACACAGAOA AACUGGACDACGUGCAOc 10 (1) 



TABLE 11 (CON'T)
MOTIF III 



'^"*^2HgS*«cuaugaaagaauuui»afe5i:5;rau X8 (i, 

ucaagAAGAGACAOUCGAAUGAOOCC AACCGGUDGlTnT ^ Z 

^^<=<5GOOGaucuaugaaagaauuuuauat5S5J,u 9a (1) 

00 **«^5SS*«"»«9aaagaaSSiniiuaucucuau 8 (2) * 



ucaagGGAGOUAUACAG^^,^,,^^^^^^^^^^^^^ 



--caagGGAUAUJ^CGAGOGAU^ -.-^ . 

""cuaugaaagaauttuuauaucucSau 16 



U) 






TABLE 12 



Stem 1(b) 


9 9 

0 9 U O 
DD O. O O O O OO 
D U 3 U 1 U 1 O o 
309 m DfOSSiOO oou 
oom o O(0imuiO9OO3 

DOOE^DOOUOa 300 flJDOO 

fonooououoomoo 30<o 
OUSO^OOOODUUUUUUO 


Loop 3 


yuuooowuouuSoooSo 


Stem 2(b) 




Loop 2 


ACGAAUGA 
UGAACUU 

UU 
-OPEN- 
CUAAC 
UUUCUGAU 
CUUAAC 
CUA 

GGAA 
UCUAGAU 
AAGACAAUA 
GAUAUA 
AUCAAAC 
AUAAA 
-OPEN- 
UUCGG 


i 

1 <v 
1 *^ 

1 o 
1 


CAC 
CC 

CACG 
Caucu 

CAGA 
CU 

CAAA 

CACUAU 

CAC 

CAGC 

CAG 

CAG 

CAG 

CAG 

CA 

CACGGC 


Loopl 


UUGAGAUA 
UUCACAUA 
UUCACAAA 
UUCACACA 
UUCACAAA 
UUCACAUA 
UUCACGUA 
UGCACAUA 
UUGAGAUA 
UUUCACAUA 
AAGAUA 
AAGAUA 
CAGAUA 
UUGAAA 
CAUGACA 
ACCGUA 
UUCACAUA 


Stem 1(a) 


GGAUC 
caaucaag 
gGGUGCA 
ACC 
agAUCGAA 
agC 
CACC 
agAUGGC 
agA-A-GCC 
agGUGGA 
AGAU6 
agAGCG 
gGCAG 
AUAG 
gCACG 
GACGCUG 
GGGACCC 


1 


r-tiHCN'Wi-linr-lfOrOi-iaiJNr-ti-lr-t I | 1 


O 

c 
o 

o 


tn tsi n to f f-(fo i^ummoot^n 



TABLE 13

[Table 13 is not legible in the scan.]



APPENDIX

Selection. A simple kinetic mechanism for reversible protein-RNA complex formation in
a well-mixed solution is written as follows:

(1)  $[Pf] + [RNAf_i] \underset{k_{-i}}{\overset{k_{+i}}{\rightleftharpoons}} [P{:}RNA_i], \quad i = 1, \ldots, n,$

where $[Pf]$ is the free protein concentration, $[RNAf_i]$ is the free RNA species-i concentration,
$[P{:}RNA_i]$ is the protein-RNA species-i complex concentration, $k_{+i}$ is the rate constant for
association of free protein and free RNA species-i, $k_{-i}$ is the rate constant for dissociation of
protein-RNA species-i complexes, and $n$ is the number of RNA sequences with a unique set of rate
constants. Alternative mechanisms, including multiple binding sites or cooperativity, could be
considered in subsequent treatments with appropriate extensions of this simple scheme.

For any system represented by the above scheme, the fundamental chemical-kinetic or
mass-action equations describing the change in concentration of each protein-RNA species-i
complex as a function of time are:

(2)  $\frac{d[P{:}RNA_i]}{dt} = k_{+i} \, [Pf] \, [RNAf_i] - k_{-i} \, [P{:}RNA_i], \quad i = 1, \ldots, n,$

where $[Pf]$, $[RNAf_i]$ and $[P{:}RNA_i]$ are the concentrations of free protein, free RNA species-i, and
protein-RNA species-i complex at time $t$.

The free protein concentration is the difference between the total protein concentration and
the concentration of all protein-RNA complexes ($[P] - \sum_{k=1}^{n} [P{:}RNA_k]$); likewise, the free RNA
species-i concentration is the difference between the total RNA species-i concentration and the
protein-RNA species-i complex concentration ($[RNA_i] - [P{:}RNA_i]$):

(3)  $\frac{d[P{:}RNA_i]}{dt} = k_{+i} \left( [P] - \sum_{k=1}^{n} [P{:}RNA_k] \right) \left( [RNA_i] - [P{:}RNA_i] \right) - k_{-i} \, [P{:}RNA_i], \quad i = 1, \ldots, n.$
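
As an illustration of how Eq. (3) may be used for kinetic analysis, the following minimal sketch integrates it with a fixed-step explicit Euler scheme. The integrator, the step size, the function name `integrate_binding` and all numerical values are assumptions made for illustration only; they are not taken from the program described in this Appendix.

```python
def integrate_binding(p_total, rna_total, k_on, k_off, dt, steps):
    """Integrate Eq. (3) by explicit Euler steps.

    p_total   -- total protein concentration [P]
    rna_total -- list of total RNA species-i concentrations [RNA_i]
    k_on      -- list of association rate constants k_{+i}
    k_off     -- list of dissociation rate constants k_{-i}
    Returns the list of complex concentrations [P:RNA_i] after `steps` steps.
    """
    n = len(rna_total)
    complexes = [0.0] * n  # no complex has formed at time zero
    for _ in range(steps):
        p_free = p_total - sum(complexes)
        rates = [
            k_on[i] * p_free * (rna_total[i] - complexes[i])
            - k_off[i] * complexes[i]
            for i in range(n)
        ]
        complexes = [c + dt * r for c, r in zip(complexes, rates)]
    return complexes


# Hypothetical example in arbitrary units: two species competing for one
# protein, the first with a ten-fold slower dissociation rate.
bound = integrate_binding(
    p_total=1.0,
    rna_total=[1.0, 1.0],
    k_on=[1.0, 1.0],
    k_off=[0.1, 1.0],
    dt=0.01,
    steps=5000,
)
```

Run long enough, the integration settles at the concentrations for which the right-hand side of Eq. (3) vanishes, which is the equilibrium condition treated next.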



These dynamic equations can be used for either kinetic or equilibrium analysis. The continuous
differential form is [illegible] whenever the [illegible] rate of each process is large relative to [illegible]
process. Or, in other words, Eq. (3) is [illegible] for description of a pool of RNA with several
molecules representing each unique set of rate constants. Whenever the [illegible]
just a few molecules of the best-binding RNA species [illegible]
deterministic concentrations [illegible].

At equilibrium, the rate of change in concentration of each protein-RNA species-i complex is zero:

(4)  $\left( [P] - \sum_{k=1}^{n} [P{:}RNA_k] \right) \left( [RNA_i] - [P{:}RNA_i] \right) - Kd_i \, [P{:}RNA_i] = 0, \quad i = 1, \ldots, n,$

where $Kd_i$ is the equilibrium dissociation constant for
protein-RNA species-i complex ($Kd_i = k_{-i}/k_{+i}$).

When only one RNA species is considered ($n = 1$), the
equilibrium concentration of protein-RNA complex [illegible]
quadratic equation:

(5)  $[P{:}RNA_1]^2 - \left( [P] + [RNA_1] + Kd_1 \right) [P{:}RNA_1] + [P] \, [RNA_1] = 0,$

which has two real roots, one physically realizable:

(6)  $[P{:}RNA_1] = \dfrac{2 \, [P] \, [RNA_1]}{\left( [P] + [RNA_1] + Kd_1 \right) + \sqrt{\left( [P] + [RNA_1] + Kd_1 \right)^2 - 4 \, [P] \, [RNA_1]}}$
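
The physically realizable root of Eq. (6) can be evaluated directly. The sketch below is illustrative only; the function name and the example concentrations are hypothetical.

```python
import math


def complex_concentration(p_total, rna_total, kd):
    """Equilibrium [P:RNA] for a single RNA species, following Eq. (6)."""
    b = p_total + rna_total + kd
    discriminant = b * b - 4.0 * p_total * rna_total
    return 2.0 * p_total * rna_total / (b + math.sqrt(discriminant))


# Hypothetical example: 1 nM protein, 1 nM RNA, Kd of 1 nM (all in mol/L).
bound = complex_concentration(1e-9, 1e-9, 1e-9)
```

Writing the root with the square root in the denominator, as in Eq. (6), avoids subtracting two nearly equal numbers when the product of the protein and RNA concentrations is small compared with the square of their sum plus the dissociation constant.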






Of course there are numerous classical approximations for equilibrium or quasi-steady-state
concentrations of complexes, like that in the Michaelis-Menten formalism, but none give sufficient
accuracy over the range of total RNA and protein concentrations used in SELEX. (For revealing
discussions of some pitfalls and limitations of classical approximation see Savageau, 1991; Straus
& Goldstein, 1943; Webb, 1963.) Although analytical solution of the quadratic equation for
simple reversible association of a single RNA species with a single binding site on the protein is
accurate over all RNA and protein concentrations used in SELEX, and although the bound
concentrations of two competing species can be calculated by analytical solution of a cubic
equation, iterative numerical methods are required to calculate equilibrium concentrations of
protein-RNA complexes whenever three or more competing RNA species are considered.

We have developed a computer program to solve for the equilibrium concentration of each
protein-RNA species-i complex, $[P{:}RNA_i]$, given any total protein concentration, $[P]$, any
distribution of RNA species-i concentrations, $[RNA_i]$, and any distribution of equilibrium
dissociation constants, $Kd_i$. The Jacobian matrix (e.g., see Luenberger, 1973) for implicit solution
of Eq. (4) by Newton's method (e.g., see Luenberger, 1973; Press et al., 1988) is calculated
using the following formula:

(7)  $J_{ij} = -\left( [RNA_i] - [P{:}RNA_i] \right) - \delta_{ij} \left( [P] - \sum_{k=1}^{n} [P{:}RNA_k] + Kd_i \right),$

where $J_{ij}$ is the element in row-i, column-j of the Jacobian matrix, with $\delta_{ij} = 1$ for $i = j$ and $\delta_{ij} = 0$ for $i \neq j$.
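
A Newton iteration of the kind described above can be sketched as follows. This is not the program referred to in the text: the Jacobian is obtained here by differentiating the left-hand side of Eq. (4) with respect to each complex concentration, the starting guess of zero bound complex is an assumption, and the linear solver and all function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residuals(p_total, rna_total, kd, x):
    """Left-hand side of Eq. (4) for every species."""
    p_free = p_total - sum(x)
    return [p_free * (r - xi) - k * xi for r, k, xi in zip(rna_total, kd, x)]


def equilibrium_complexes(p_total, rna_total, kd, tol=1e-12, max_iter=100):
    """Newton's method for the [P:RNA_i] that satisfy Eq. (4)."""
    n = len(rna_total)
    x = [0.0] * n  # assumed starting guess: nothing bound
    scale = max(p_total, max(rna_total))
    for _ in range(max_iter):
        p_free = p_total - sum(x)
        f = residuals(p_total, rna_total, kd, x)
        # Derivative of residual i with respect to complex j.
        jac = [
            [
                -(rna_total[i] - x[i]) - (p_free + kd[i] if i == j else 0.0)
                for j in range(n)
            ]
            for i in range(n)
        ]
        step = solve_linear(jac, [-v for v in f])
        x = [xi + si for xi, si in zip(x, step)]
        if max(abs(s) for s in step) <= tol * scale:
            return x
    raise RuntimeError("Newton iteration did not converge")


# Hypothetical example in arbitrary units: three competing species.
bound = equilibrium_complexes(1.0, [1.0, 2.0, 3.0], [0.1, 1.0, 10.0])
```

For a single species the iteration reproduces the analytical root of the quadratic equation, which gives a convenient check on any implementation.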






(e.g., see Luenberger, 1973; Press et al., 1988). [illegible]
protein-RNA species-i complex, $[P{:}RNA_i]$. By the bulk $Kd$ for the total RNA pool, the
concentration of protein in all protein-RNA complexes is:

(8)  [equation illegible]

[illegible] the total RNA pool, and $\langle Kd \rangle$ is the bulk equilibrium dissociation constant for the total RNA pool,
calculated using the following formula:

(9)  [equation illegible]

[illegible] the following formula [illegible]
Solutions for the values of $[P{:}RNA_i]$ that satisfy Eq. [illegible] to a high level of accuracy
[illegible] of Newton's method [illegible]




equilibrium dissociation constants and the abundance of each RNA species. One reason for this
level of accuracy is that errors in $[P{:}RNA]$ tend to cancel in Eq. (10) whenever $[P] - [P{:}RNA]$ is
greater than $Kd_i$, for example, when $[RNA]$ is less than $Kd_i$ or when $Kd_i$ is less than $\langle Kd \rangle$.
Interestingly, this means that accuracy tends to be higher for any protein-RNA species-i complex
with better binding than the bulk RNA pool. Representative examples of the initial accuracy of
enrichment calculations (defined as the increase in the fraction of the total RNA pool composed of
the best-binding RNA species in each round, and approximated by substituting Eq. (10) into
Eq. (30)) [illegible]. The overall
accuracy shown is a reflection of the accuracy of the equilibrium concentrations calculated for
every protein-RNA species-i complex using Eq. (10). In a subsequent section, we capitalize on
this accuracy to calculate optimum RNA and protein concentrations for maximum enrichment.

Partitioning. Any method of partitioning different species of nucleic acid sequences,
including filter binding (Tuerk & Gold, 1990), gel-mobility shifts (Blackwell & Weintraub, 1990),
affinity chromatography (Ellington & Szostak, 1990; Green et al., 1990; Oliphant & Struhl, 1987;
Oliphant & Struhl, 1988), antibody precipitation, phase partitions, or protection from nucleolytic
cleavage (Robertson & Joyce, 1990), could be used to advantage with SELEX. For example,
with filter binding most protein-RNA complexes stick to a nitrocellulose filter while most free
RNA molecules wash through (Uhlenbeck et al., 1983; Yarus, 1976; Yarus & Berg, 1967; Yarus
& Berg, 1970). The actual fraction of protein-RNA complex that sticks and then can be recovered
from the filter is treated in the next section.

Since a fraction of free RNA molecules also sticks to the filter as nonspecific background,
the total amount of each RNA species-i collected on the filter is calculated using the following
formula, which accounts for both the desired signal from the best-binding RNA molecules in
protein-RNA complexes and the noise from free RNA molecules collected as nonspecific
background plus competing RNA molecules in protein-RNA complexes:



where $RNA_i^{filt}$ is the number of molecules of RNA species-i collected, $Vol$ is the volume of the
reaction mixture passed through the filter, $[P{:}RNA_i]$ is the equilibrium concentration of
protein-RNA species-i complex calculated as described in the preceding section, $BG$ is the fraction of free
RNA collected as nonspecific background, and $[RNA_i]$ is the total RNA species-i concentration.
Any method of partitioning typically gives less than perfect separation of bound and unbound
ligands, and hence requires a measure for the [illegible]
bound ligands in each round.

[illegible] from the filter is calculated using the following formula:

(12)  $RNA_i^{pcr} = FR \cdot RNA_i^{filt}, \quad i = 1, \ldots, n,$

[illegible] complex [illegible] the filter [illegible]
recovered and copied [illegible] cDNA for PCR. Assuming [illegible]
constant for all species [illegible] given sufficient time [illegible]
molecules have the same primer [illegible] PCR [illegible] of primer [illegible]
species, whether rare or abundant, [illegible]




molecule. Also, since each RNA molecule is the same length, there is no differential rate of
amplification on the basis of size. Of course, if any RNA species has a secondary structure that
interferes with primer annealing for cDNA synthesis, or if the primary or secondary structure of
the corresponding cDNA slows the rate of DNA polymerase during PCR amplification, enrichment
of that species is reduced. We do not incorporate these effects, since there are no good rules to
predict what structures actually make a difference. When more is learned about these structures,
any significant effects can be added to the mathematical description of SELEX.

The total amount of RNA recovered from the filter is calculated by summing the number of
molecules of each species collected to make cDNA copies for PCR amplification:

(13)  $RNA^{pcr} = \sum_{i=1}^{n} RNA_i^{pcr}$

Any "carrier" or "nonspecific competitor" molecules should be excluded from the total in Eq. (13),
since without PCR primer sites these molecules do not amplify. Affinity measurement protocols
often include these nonspecific competitor RNA molecules, and if such molecules also are used in
SELEX, obviously they should be nonamplifiable. Interestingly, whenever nonspecific competitor
molecules interact with the protein at the same site as the best-binding ligand molecules, the main
consequence of adding competitor molecules is a reduction in the number of specific sites available
for selection. Hence, to determine the protein concentration that binds the desired amount of
amplifiable ligand molecules with a high concentration of nonspecific competitor molecules
present, corrected binding curves must be generated by including the appropriate concentration of
these molecules in each titration. The advantages of using a high concentration of nonspecific,
nonamplifiable competitor molecules in each round of SELEX can include a reduction in
adsorption of amplifiable ligand molecules to any nonspecific sites on labware, a reduction in
binding of amplifiable ligand molecules to any nonspecific sites on the target protein, or a reduction
in the fraction of free amplifiable molecules collected as nonspecific background on
"false-partitioning" sites, but only when such sites are present in significant numbers and are effectively
saturated by the amount of nonspecific competitor molecules used. If these conditions are not met,
the effect of adding nonspecific competitor molecules essentially is the same as reducing the
amount of protein used.

The amount of each amplifiable RNA species-i recovered after one round, relative to the
total in Eq. (13), is calculated as follows:

(14)  $F_i' = \dfrac{RNA_i^{pcr}}{RNA^{pcr}}$

After PCR amplification of cDNA copies and renormalization of the RNA pool back to its original
concentration by in vitro transcription (from identical promoter sites on all cDNA molecules), the
concentration of each RNA species after one round of SELEX is:

(15)  $[RNA_i'] = F_i' \cdot [RNA],$

where $[RNA]$ is the total concentration of the RNA pool. For each additional round of SELEX,
the concentration of every RNA species can be computed by reiteration of Eqs. [illegible]-(15), with $F_i'$
for each RNA species from one round being the starting fraction $F_i$ in the next [see Eq. (9)].



CLAIMS

1. A method for identifying nucleic acid ligands from 
a candidate mixture of nucleic acids, said nucleic acid 
ligands being a ligand of a given target comprising: 

a) contacting the candidate mixture with the 
target, wherein nucleic acids having an increased 
affinity to the target relative to the candidate 
mixture may be partitioned from the remainder of the 
candidate mixture; 

b) partitioning the increased affinity 
nucleic acids from the remainder of the candidate 
mixture; and 

c) amplifying the increased affinity nucleic 
acids to yield a ligand-enriched mixture of nucleic 
acids.

2. The method of claim 1 wherein said candidate
mixture is contacted with said target under conditions
favorable for binding, and nucleic acid-target pairs
are formed.

3. The method of claim 1 further comprising the step: 

d) repeating steps a) through c) using the
ligand enriched mixture of each successive repeat as 
many times as required to yield a desired level of 
increased ligand enrichment. 

4. The method of claim 3 wherein said level of 
increased ligand enrichment is sufficient to establish
a ligand solution to said target. 

5. The method of claim 1 wherein said target is a 
protein. 

6. The method of claim 5 wherein said protein is a 
nucleic acid binding protein. 




7. The method of claim 5 wherein said protein is not
known to bind nucleic acids.

8. The method of claim 1 wherein said amplification
step employs polymerase chain reaction (PCR).

9. The method of claim 1 wherein said partitioning
step employs filter binding selections.

10. The method of claim 1 wherein said candidate
mixture is comprised of ribonucleic acids.

11. The method of claim 9 wherein said contacting step
is performed in the presence of an excess of said
desired target.

12. The method of claim 1 wherein said target is
supported on a matrix.

13. The method of claim 12 wherein said matrix-supported
target is held in a column.

14. The method of claim 1 wherein said candidate
mixture comprises nucleic acids each comprising a
segment of randomized sequence.

15. The method of claim 1 wherein said candidate
mixture is prepared by synthesis from a template
containing conserved nucleotides and randomized or
biased nucleotides.



16. The method of claim 1 wherein said candidate 
mixture comprises nucleic acids each comprising a 
segment of conserved sequences. 

17. The method of claim 16 wherein said conserved
sequence segment of the nucleic acids of said candidate
mixture comprises a nucleic acid sequence known to bind
to said target.

18. The method of claim 2 wherein said target is a
protein and said nucleic acid-target pairs contain
Michael adducts.



19. A method for preparing a nucleic acid antibody
comprising identifying a ligand solution according to
claim 4, and cloning a nucleic acid derived from said
solution, whereby a purified nucleic acid antibody is
prepared.

20. A method for selecting a nucleic acid which
affects the function of a target molecule which
comprises the additional step of screening nucleic acid
molecules identified by the method of claim 1 for
increased affinity to said target for their ability to
affect the function of said target molecules.

21. The method of claim 4 further comprising the steps
of:

e) preparing a second candidate mixture
comprising nucleic acids, each nucleic acid comprising
randomized nucleotide segments and conserved nucleotide
segments, and said conserved nucleotide segment
comprising nucleotide sequences derived from said
ligand solution; and

f) repeating steps a) through d) using said
second candidate mixture.



22. The method of claim 4 further comprising the steps
of:

e) preparing a second candidate mixture
comprising nucleic acids, each nucleic acid containing
biased nucleotides derived from said ligand solution;
and

f) repeating steps a) through d) using said
second candidate mixture.

23. The method of claim 1 wherein said target is a
transition-state analog.

24. A method of making a mixture of nucleic acids 
enriched for nucleic acid ligands of a given target 
comprising the steps:

a) preparing a candidate mixture of nucleic 
acids comprising a conserved segment and a randomized 
segment; 

b) contacting the mixture of nucleic acids 
with the target under conditions favorable for binding, 
to form nucleic acid-target pairs and unbound nucleic
acids; 

c) partitioning unbound nucleic acids and 
nucleic acid-target pairs to separate unbound nucleic 
acids from nucleic acid-target pairs; and 

d) amplification of the partitioned nucleic 
acids of the nucleic acid-target pairs to yield a 
ligand-enriched mixture of nucleic acids. 

25. The method of claim 24 wherein the randomized 
segment of said nucleic acids is chemically 
synthesized. 

26. The method of claim 24 wherein the randomized 
segment of said nucleic acids is synthesized by an 
enzyme catalyzed reaction. 

27. The method of claim 24 wherein the randomized 
segment of said nucleic acids is made by cleavage of a 
naturally-occurring nucleic acid.

28. The method of claim 24 wherein the randomized
segment of said nucleic acids is a contiguous string of
at least about 15 nucleotides.

29. The method of claim 24 wherein the randomized 
segment of said nucleic acids is a contiguous string of
at least about 25 nucleotides. 

30. A nucleic acid candidate mixture comprised of 
nucleic acids, said nucleic acids comprised of a 
conserved nucleotide segment and a randomized 
nucleotide segment. 

31. The nucleic acid candidate mixture of claim 30 
wherein said randomized nucleotide segment is a 
contiguous string of at least about 15 nucleotides. 

32. The nucleic acid candidate mixture of claim 30 
wherein said randomized nucleotide segment is a 
contiguous string of at least about 25 nucleotides. 

33. The nucleic acid candidate mixture of claim 30 
wherein said mixture contains at least about lo'
nucleic acids. 

34. The nucleic acid candidate mixture of claim 30 
wherein said mixture contains at least about 10^^ 
nucleic acids. 
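
The relationship between the length of a randomized segment and the number of distinct sequences it can encode is simple arithmetic, sketched below in Python. The function names and the pool size used in the example are hypothetical choices for illustration and are not values taken from the claims.

```python
def sequence_space(n_random, alphabet_size=4):
    """Number of distinct sequences in a fully randomized segment."""
    return alphabet_size ** n_random


def mean_copies_per_sequence(pool_size, n_random):
    """Average number of molecules per possible sequence in a pool."""
    return pool_size / sequence_space(n_random)


if __name__ == "__main__":
    for n in (15, 25):
        print(n, sequence_space(n))
    # Hypothetical pool of one trillion molecules with a 25-nt random segment:
    print(mean_copies_per_sequence(10**12, 25))
```

A 15-nucleotide randomized segment gives 4^15 (about 1.07 x 10^9) possible sequences and a 25-nucleotide segment gives 4^25 (about 1.13 x 10^15), so for longer segments a physical pool samples only part of the possible sequence space.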

35. The nucleic acid candidate mixture of claim 30 
wherein said randomized nucleotide segment is flanked 
by conserved nucleotide segments. 

36. The nucleic acid candidate mixture of claim 35 
wherein said flanking conserved nucleotide segments are 
selected to aid in the amplification of selected 
nucleic acids. 
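
As an illustration of a candidate mixture in which a randomized segment is flanked by conserved segments, the following Python sketch builds a small in-silico library. The flanking sequences, the library size and the function name are hypothetical and chosen only for the example; they are not sequences disclosed in this document.

```python
import random


def make_library(five_prime, three_prime, n_random, size, seed=0):
    """Build RNA sequences: conserved 5' segment + random segment + conserved 3' segment."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    library = []
    for _ in range(size):
        variable = "".join(rng.choice("ACGU") for _ in range(n_random))
        library.append(five_prime + variable + three_prime)
    return library


if __name__ == "__main__":
    FIVE = "GGGAGACAAG"    # hypothetical conserved 5' segment (primer site)
    THREE = "CUUCGGAUCC"   # hypothetical conserved 3' segment (primer site)
    for seq in make_library(FIVE, THREE, 25, 3):
        print(seq)
```

Because every member shares the same flanking segments, a single primer pair can amplify any selected member regardless of its randomized segment.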

37. The nucleic acid candidate mixture of claim 30
wherein said conserved and randomized segments are
arranged to enhance the percentage of nucleic acids in
said candidate mixture having a given tertiary
structure.

38. The nucleic acid candidate mixture of claim 37 
enhanced in nucleic acids configured as hairpin loops. 

39. The nucleic acid candidate mixture of claim 37 
enhanced in nucleic acids configured as pseudoknots. 

40. A synthetic nucleic acid antibody having a 
specific binding affinity for a target, such target 
being a three dimensional chemical structure. 

41. The nucleic acid antibody of claim 40 comprising a 
nucleic acid ligand selected from the group consisting 
of single-stranded RNA, double-stranded RNA, single-stranded
DNA, double-stranded DNA and chemical
modifications thereof. 

42. The nucleic acid antibody of claim 40, said target 
being a structure other than a polynucleotide which 
binds to said nucleic acid antibody through a mechanism 
which predominantly depends on Watson/Crick base 
pairing or triple helix binding; provided, however, 
that when the nucleic acid antibody is double stranded
DNA, the target is not a naturally occurring protein
whose physiological function depends on specific
binding to double stranded DNA.

43. The nucleic acid antibody of claim 40 wherein the 
target is a protein. 

44. The nucleic acid antibody of claim 40 wherein the 
antibody is comprised of single stranded RNA. 



45. The nucleic acid antibody of claim 40 comprised of 
a plurality of target-specific ligands. 

46. The nucleic acid antibody of claim 45 comprised of
more than one identical ligands.

47. The nucleic acid antibody of claim 45 comprised of
at least two distinct target-specific ligands.

48. The nucleic acid antibody of claim 40 further
comprised of a non-nucleic acid element.

49. The nucleic acid antibody of claim 48 wherein said
non-nucleic acid element is capable of directing said
antibody to a selected location in the body of a
patient.

50. The nucleic acid antibody of claim 48 wherein said
non-nucleic acid element is a target-specific ligand.

51. The nucleic acid antibody of claim 40 wherein said
target is a protein and said antibody is capable of
forming a Michael adduct with said target.

52. The nucleic acid antibody of claim 40 comprised of
a nucleic acid ligand obtained by the process of claim
1.

53. The nucleic acid antibody of claim 52 wherein said
target is a protein.

54. The nucleic acid antibody of claim 52 wherein said
nucleic acid ligand is selected from the group
consisting of single-stranded RNA, double-stranded RNA,
single-stranded DNA, double-stranded DNA and
modifications thereof.



55. The nucleic acid antibody of claim 43 wherein said 
protein is a nucleic acid polymerase.

56. The nucleic acid antibody of claim 55 wherein said 
protein is a DNA polymerase.

57. The nucleic acid antibody of claim 56 wherein said 
protein is gp43.



58. The nucleic acid of claim 57 comprising the RNA 
sequence 

5'-NNNGAGCCOAGCAACCUGGGCUAGGAAU-3'
or the corresponding DNA sequence thereof or 
complementary sequences thereof. 

59. The nucleic acid antibody of claim 55 wherein said 
protein is a reverse transcriptase. 

60. The nucleic acid antibody of claim 59 wherein said 
reverse transcriptase is HIV-1 reverse transcriptase.

61. The nucleic acid antibody of claim 60 wherein said 
antibody is configured as a pseudoknot. 

62. The nucleic acid antibody of claim 61 comprising 
the RNA sequence 5'-UUCCG-3'.

63. The nucleic acid antibody of claim 61 comprising
the RNA sequence

U ANAAYYYYYY
or the corresponding DNA sequences thereof or
complementary sequence thereof.



64. The nucleic acid antibody of claim 61 comprising 
the RNA sequence 5'-CGGGA-3'.

65. The nucleic acid antibody of claim 61 wherein Loop
1 is two nucleotides long.

66. The nucleic acid antibody of claim 61 wherein Stem
2 is 5 or 6 base pairs.

67. The nucleic acid antibody of claim 61 wherein if 
Loop 2 exists, the nucleotides are As. 

68. The nucleic acid antibody of claim 61 wherein Loop
3 comprises at least 3 nucleotides, said nucleotides
enriched in A.

69. The nucleic acid antibody of claim 61 wherein Loop
1, Stem 2(a) and Loop 2 consist of eight nucleotides.

70. The nucleic acid antibody of claim 43 wherein said 
protein is a bacteriophage coat protein. 

71. The nucleic acid antibody of claim 70 wherein said 
protein is bacteriophage R17 coat protein.

72. The nucleic acid antibody of claim 71 comprising 
the RNA sequence

U C 
A A 
CG 
CG 

A 

CG 
NN' 



or the corresponding DNA sequence thereof or
complementary sequences thereof.



73. The nucleic acid antibody of claim 43 wherein said 
protein is a serine protease.

74. The nucleic acid antibody of claim 73 wherein said 
protein is tPA. 



75. The nucleic acid antibody of claim 43 wherein said 
protein is a mammalian receptor. 

76. The nucleic acid antibody of claim 75 wherein said 
protein is human growth factor.

77. The nucleic acid antibody of claim 43 wherein said 
protein is a mammalian hormone or factor.

78. The nucleic acid antibody of claim 77 wherein said 
protein is insulin. 

79. The nucleic acid antibody of claim 43 wherein said 
protein is not known to bind nucleic acids.

80. The nucleic acid antibody of claim 77 wherein said 
protein is Nerve Growth Factor. 



81. The nucleic acid antibody of claim 80 comprising an
RNA sequence selected from the group consisting of:

5'-CUCA-3';

5'-GAGCGCAAGACGAAUAG-3';

5'-UACA-3'; and

5'-ACa«JCGAUGACCGGAAUGCCGCACACAGAG-3'

or the corresponding DNA sequence thereof or
complementary sequences thereof.

82. The nucleic acid antibody of claim 56 wherein said 
protein is HSV-1 DNA polymerase.

83. The nucleic acid antibody of claim 82 comprising
the RNA sequence:

5'-UAAGGAGGCCAC-3'
or the corresponding DNA sequence thereof or
complementary sequences thereof.
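
Several claims recite an RNA sequence together with "the corresponding DNA sequence" and "complementary sequences", and others use degenerate symbols such as N, R and Y. The Python sketch below shows one way these relationships can be computed. It rests on assumptions that are not stated in the claims themselves: that "complementary" may be read as the Watson-Crick reverse complement, and that N, R and Y carry their conventional meanings (any base, purine, pyrimidine). All function names are introduced here for illustration.

```python
import re

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Conventional degenerate symbols (assumed meanings): R = A or G, Y = C or U, N = any base.
DEGENERATE_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "N": "[ACGU]",
}


def rna_to_dna(rna):
    """Corresponding DNA sequence: replace uracil with thymine."""
    return rna.replace("U", "T")


def reverse_complement(rna):
    """Watson-Crick reverse complement of an RNA sequence, written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def matches_pattern(pattern, rna):
    """True if the RNA sequence fits a pattern written with degenerate symbols."""
    regex = "".join(DEGENERATE_RNA[symbol] for symbol in pattern)
    return re.fullmatch(regex, rna) is not None


if __name__ == "__main__":
    seq = "UAAGGAGGCCAC"
    print(rna_to_dna(seq))
    print(reverse_complement(seq))
    print(matches_pattern("NNYR", "AUCG"))
```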


84. The nucleic acid antibody of claim 43 wherein said 
protein is a ribosomal protein. 

85. The nucleic acid antibody of claim 84 wherein said 
protein is E. coli ribosomal protein S1.

86. The nucleic acid antibody of claim 78 comprising
the RNA sequence:

R N G 

IS ^^_^Y G 

(C/G) A 
(U/A) A 
X C 

20 // / / / N-N' 

A-U 
A-U 



25 




G-C 

5'-NNYR(G/C)(A/U)GACAC-GNN-3'

or the corresponding DNA sequence or complementary 
sequences thereof. 



87. The nucleic acid antibody of claim 43 wherein said 
protein is a viral rev protein.

88. The nucleic acid antibody of claim 87 wherein said 
protein is HIV-1 rev protein.

89. The nucleic acid antibody of claim 88 configured
as an asymmetric bulge.

90. The nucleic acid antibody of claim 89 comprising an
RNA sequence selected from the group consisting of:

G-A-G-A 



°7// 



N 



^ ' "'-^'"A— A— X— X III C— A— Y— # 

3 '~Y-Y-y-y-Y-.C-U-C-A-G-G-U-Y— 5 ' 



10 



6 A 
H N 
N A 
5 ' ~X-X-X-X-6 C-A-G---? / 

3'-^Y-y-y-y.C-A-C-A-G-G-2-C-5' 



and 



15 U-U-G-A-U-C-U-A 

A — - - 

A 



I cic //// ^-G-A-A-3' 
n-n yr/y / G-C-U-U~5' 
U-D C-U-A-G-U-A-A 

or the corresponding DNA sequences thereof or 
complementary sequences thereof. 



FIG. 2. TEMPLATE CONSTRUCTION. Legible panel labels: a), b) ligation; variable sequence; 3' primer; selection; d) cDNA synthesis of selected RNAs eluted from filters; f) in vitro transcription to begin the next round of SELEX.






FIG. 4. Legible labels: EXPERIMENT; Variable Region; lanes U, G, C, A (three sets); sequence read-out A, A/G, U/C, A, A, C, U/C, U/C.






FIG. 7. Legible labels: hairpin with randomized loop positions (N); SELEX; selected loop positions U/C, A/G, A, A, C, U/C, U/C; Wild Type (9/20); Major Variant (8/20); three variants at 1/20 each; Kd labels 3.2 x 10^-7 and 4.8 x 10^-9, plus 2.7 x 10, 1.2 x 10 and 1.7 x 10 with exponents not legible.






FIG. 8. VARIABLE TEMPLATE SYNTHESIS USING TERMINAL TRANSFERASE. Legible panel labels: 5' primer (or primary ligand sequence) and 3' primer; tailing with terminal transferase using random nucleotides (dNTPs); homopolymer tailing of lengthened 5' primer and 3' primer (CCCCC and GGGGG tails); annealing and fill-in.



FIG. 9. "WALKING" BY EXTENDING THE PRIMARY LIGAND. Legible panel labels: SECONDARY; FURTHER SELEX TO ISOLATE HIGHER AFFINITY LIGANDS.



FIG. 10. ANCHORING OF BRIDGING OLIGONUCLEOTIDE AND SECONDARY LIGAND EVOLUTION. Legible panel labels: protein of interest; inhibitor; attached guide oligonucleotide; connecting linker; guide oligonucleotide anchored at target site; plus strand to be used in SELEX (5' primer, variable region, guide oligonucleotide, 3' primer); SELEX leads to isolation of ligands that interact with secondary binding sites; secondary binding site.



FIG. 11. SECONDARY LIGAND-DIRECTED PRIMARY LIGAND EVOLUTION. Legible panel labels: insertion of variable sequence at guide site; further SELEX to evolve ligands to primary target domain.






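FIG. 14. Plot; x-axis: [HIV-RT], log10 (nM); y-axis ticks up to 100. Legend (isolate, sequence as printed, number of isolates):

1.1 UCQOgAAUUCCGUUUUCAGUCGGGAAAAACUGAACA QUCU (13)
1.3 UCaagAAUAUCUUCCGAAGCCGAACGGGAAAACCGGCauCU (1)
1.3 variant, differing at the positions printed as G and A (1)
1.4 UCQggGGCAUCUGGGA GGGUAApGGUAAp GUUGUCGGauCU (4)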



FIG. 15. Plot; x-axis: [HIV-RT], log10 (nM), ticks -2.0 to 2.0; y-axis ticks 0 to 100. Legend (isolate, sequence as printed, number of isolates):

2.1a UCaag--AAUAUA-iJCCGMCUCGAC5GgAUAACGA6AA-G0Uai (3)
2.2b UCaagUACCUA6GUGAUAAAA6GGA6AACACGUGUGa-CU (13)
2.5b UCaagACAGlWU^UUCUUGAUCAU^GGGACAAAUGau^ (3)
1.1 UCaCQAAlJUC^GUUUUCAGUCGGGAAAAACUGAACAAUCU (13)



FIG. 16A

FIG. 16B



FIG. 19. Legible labels: nucleotide columns A, C, G, U; base-pair columns AU, CG, GC, UG, GU; rows for single positions (for example -4, -5, -7) and for paired positions (-12/+1 through -17/+6); 5' and 3' ends marked.



FIG. 20. x-axis: Bradykinin (M).






INTERNATIONAL SEARCH REPORT

International Application No. PCT/US91/04078

I. CLASSIFICATION OF SUBJECT MATTER (if several classification symbols apply, indicate all)

According to International Patent Classification (IPC) or to both National Classification and IPC

IPC(5): C12Q 1/68; C12P 19/34; G01N 33/48, 33/566; C07H 15/12
U.S. Cl.: 435/6, 91; 436/94, 501; 536/26, 27, 28, 29

II. FIELDS SEARCHED

Minimum Documentation Searched

Classification System: U.S. Cl.
Classification Symbols: 435/6, 91; 436/94, 501; 536/26, 27, 28, 29; 935/77, 78

Documentation Searched other than Minimum Documentation
to the Extent that such Documents are Included in the Fields Searched

Computer Search: Dialog one search


III. DOCUMENTS CONSIDERED TO BE RELEVANT

Category: Y
Citation of Document, with indication, where appropriate, of the relevant passages:
Belfort et al., eds., "RNA: Catalysis, Splicing, Evolution", published 1989 by
Elsevier (Amsterdam), see pages 83-87.
Relevant to Claim No.: 1-39

Category: Y
Citation: Biochemistry, Vol. 26, No. 6, issued 1987, Romaniuk et al.,
"RNA Binding Site of R17 Coat Protein", pages 1563-1568, see entire paper.
Relevant to Claim No.: 1-39

Category: Y
Citation: Biochemistry, Vol. 22, No. 11, issued 1983, Carey et al.,
"Sequence Specific Interaction of R17 Coat Protein with its Ribonucleic Acid
Binding Site", pages 2601-2610. See page 2603, column 1.
Relevant to Claim No.: 1-39

Category: Y
Citation: Nucleic Acids Research, Vol. 17, No. 2, issued 1989, Joyce et al.,
"A Novel Technique for the Rapid Producing of Mutant RNAs", pages 711-722,
see entire paper.
Relevant to Claim No.: 1-39


* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier document but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family


IV. CERTIFICATION

Date of the Actual Completion of the International Search: 08 October 1991
Date of Mailing of this International Search Report: 21 OCT 1991
International Searching Authority: ISA/US
Signature of Authorized Officer: Stephanie W. Zitomer Ph.D. ebv

Form PCT/ISA/210 (second sheet) (Rev. 11-87)



International Application No. PCT/US91/04078

FURTHER INFORMATION CONTINUED FROM THE SECOND SHEET

V. OBSERVATIONS WHERE CERTAIN CLAIMS WERE FOUND UNSEARCHABLE

This international search report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. Claim numbers ..., because they relate to subject matter not required to be searched by this Authority, namely:

2. Claim numbers ..., because they relate to parts of the international application that do not comply with the prescribed requirements to such an extent that no meaningful international search can be carried out, specifically:

3. Claim numbers ..., because they are dependent claims not drafted in accordance with the second and third sentences of PCT Rule 6.4(a).

VI. OBSERVATIONS WHERE UNITY OF INVENTION IS LACKING

This International Searching Authority found multiple inventions in this international application as follows:

I. Claims 1-23, 30-39 comprise a first product and process of using.
II. Claims 24-29 comprise a second process.
III. Claims 40-90 comprise a second product.

1. As all required additional search fees were timely paid by the applicant, this international search report covers all searchable claims of the international application.

2. As only some of the required additional search fees were timely paid by the applicant, this international search report covers only those claims of the international application for which fees were paid, specifically claims:

3. No required additional search fees were timely paid by the applicant. Consequently, this international search report is restricted to the invention first mentioned in the claims; it is covered by claim numbers:

4. As all searchable claims could be searched without effort justifying an additional fee, the International Searching Authority did not invite payment of any additional fee.

Remark on Protest

The additional search fees were accompanied by applicant's protest.
No protest accompanied the payment of additional search fees.

Form PCT/ISA/210 (supplemental sheet (2)) (Rev. 11-87)



International Application No. PCT/US91/04078

III. DOCUMENTS CONSIDERED TO BE RELEVANT (CONTINUED FROM THE SECOND SHEET)

Category: Y
Citation of Document, with indication, where appropriate, of the relevant passages:
Proceedings, National Academy of Science, USA, Vol. 63, issued 1969, Cohen et al.,
"Interactions of Hormonal Steroids with Nucleic Acids, I. A Specific Requirement
for Guanine", pages 458-464, see abstract.
Relevant to Claim No.: 40-90

Form PCT/ISA/210 (extra sheet) (Rev. 11-87)







